Foreword



The issue of quality in higher education has been around for a 
long time. New institutions have sought to imitate the learning 
formats and organizational structures of "quality" institutions
with the expectation that, by doing this, they also would acquire
a similar reputation for quality. But many of these aspiring
institutions fail to achieve their goal because they do not fully
comprehend that quality is based on more than just format and 
structure. 

With the prediction of enrollment declines and increased
financial pressures caused by inflation, the issue of quality has
taken on even greater significance for every institution, even
those of high repute. Identifying, nurturing, and promoting the
distinctive qualities of an institution may mean the difference
between survival and extinction. This means it is now crucial
that every institution develop a more sophisticated
understanding of what quality connotes and how it can be measured.

In this report, written by Judith K. Lawrence and Kenneth
C. Green of the Higher Education Research Institute, Los
Angeles, California, the question of quality is examined from the
perspective of how quality has been measured in the past. The
authors first review studies that analyze the reputation of
graduate education and professional programs. From this examination
quantifiable indicators of quality are identified and reviewed in
light of undergraduate education. The authors conclude their
report with a discussion of quality in relation to accreditation
and state program review.

For those who are concerned with identifying and measuring 
quality, this report is the first step to understanding how quality 
has been defined in the past. With this foundation the reader 
will be better able to examine and measure quality indicators at
his or her own institution.



Jonathan D. Fife 
Director

ERIC Clearinghouse on Higher Education


Overview



Quality . . . you know what it is, yet you don't know what
it is. But that's self-contradictory. But some things are better
than others, that is, they have more quality. But when you try
to say what the quality is, apart from the things that have it,
it all goes poof! There's nothing to talk about. But if you can't
say what Quality is, how do you know what it is, or how do
you know that it even exists? If no one knows what it is, then
for all practical purposes it doesn't exist at all. But for all
practical purposes it really does exist. What else are grades
based on? Why else would people pay fortunes for some things
and throw others in the trash pile? Obviously some things are
better than others . . . but what's the "betterness"? . . . So round
and round you go, spinning mental wheels and nowhere finding
anyplace to get traction. What the hell is Quality? What is it?
(Pirsig 1974, p. 184)

Anyone concerned with quality in American higher education is 
caught in the quandary described by Pirsig: "What the hell is
Quality? What is it?" We know that a degree earned at Harvard
is different (this word, rather than "better," is used advisedly)
from one earned at Ohio State, is different from one earned at
Pratt Institute, is different from one earned at Antioch College,
is different from one earned at Oral Roberts University, and so
on, ad infinitum. 

And certainly we know intuitively, if no other way, that 
there is a tremendous range in program and institutional quality 
among the 3,000-plus colleges and universities in the U.S. higher
education system. Yet assessments of quality by particular
persons, for particular purposes, in particular contexts, result in a
variety of quality attributes that leave the meaning of the term 
elusive. 

Turning to the literature, this monograph deals primarily
with the attributes of quality defined in academic studies,
separately reviewing studies of quality at the graduate level
(chapter 1), in professional programs (chapter 2), and at the
undergraduate level (chapter 4), and separately addressing academe's
continuing attempts to quantify "quality" so as to measure it
empirically (chapter 3) rather than subjectively through
reputational ratings. In addition, chapter 5 discusses accreditation
and state program review, both of which exemplify external
approaches to assessing quality in American higher education. The
final chapter presents conclusions and recommendations
concerning quality assessment in higher education.

Starting with Hughes' pioneering rating study in 1925
through those conducted under the prestigious sponsorship of
the American Council on Education (Cartter 1966; Roose and
Andersen 1970), reputational ratings of graduate education have
formed the foundation of quality assessments of academe by
academics. Such ratings are based on peer review, wherein
programs are rated by faculty panels in the same discipline, as
experts, and their results reflect the prominence of graduate
education and of faculty research in the system. To the detriment
of the undergraduate level, of the teaching-learning function,
and of the diversity that characterizes the nation's higher
education system, 50 years of reputational ratings have consistently
identified 20 or 30 outstanding institutions, leaving them to vie
with each other for the highest absolute rank in the hierarchical
structure, and virtually ignoring the rest of our colleges and
universities.

In the domain of the top institutions, whether graduate or
professional or undergraduate, an enormous range of material
and human resources have been shown to correlate with
reputational prestige and with each other. Foremost among the
material resources are institutional size, library size, research-related
variables such as funds available, and faculty salary. Foremost
among the human resources are faculty and student abilities,
background, and achievements, particularly scholarly
productivity in lending visibility to individual scholars and their
current institutions. In other words, material and human wealth
tend to be concentrated in a few institutions.

Yet this description is surely inadequate with respect to the
meaning and measurement of quality in the nation's pluralistic
higher education system. Indeed, the researchers have been
increasingly mindful of errors of omission, errors inherent in
prioritizing the graduate level, the top domain of institutions,
and so forth. Moreover, they have become increasingly cognizant
of the myriad differences that exist in higher education (from
domain to domain, from level to level, and from discipline to
discipline), especially regarding goals and objectives. Thus, they
continue to investigate how the perplexing concept of quality
might be broadened to accommodate the strengths of such
differences. Clearly aware of having confused quantity with quality,
whether through conducting reputational studies or studies based
on quantifiable indicators or combining both, recent quality
assessments show consensus on a number of needs:

1. If comparisons must be made, they should be made
between similar types of institutions, at the same level, in the same
disciplines, and so forth.

2. Quality assessments must identify program goals and
objectives and be referenced to them.

3. Quality assessments must be based on a variety of
attributes.

4. The meaning of "quality" is, and should be, as varied as
the purposes behind an assessment, the measurement criteria
used, and the group or groups conducting the assessment; herein
lie the value and limitations of quality assessments.

5. The teaching-learning function of higher education has
been virtually ignored in quality assessments. Conceptually
and methodologically, the value-added, input-environment-output
model merits further investigation.

The quality question will be a major concern in the coming 
decade. Quality is inextricably tied to such issues as equality of 
access and choice, post-baccalaureate employment and the value 
of a college degree, curriculum structure, and student develop- 
ment and outcomes. Only by understanding how quality has been 
assessed can we know how and in what contexts it should be 
measured and which interventions will yield improvements. 

Reputational Studies of Graduate Education



The best-known quality studies in American higher education are
those sponsored by the American Council on Education (ACE):
Allan Cartter's An Assessment of Quality in Graduate Education
(1966) and its replication, A Rating of Graduate Programs
(1970) by Kenneth Roose and Charles Andersen. These studies, 
which rank graduate programs by institution, have served as 
prototypes for quality assessments of graduate education and as 
catalysts for examining quantifiable, allegedly objective indi- 
cators of quality. As a result of their influence, the literature on 
quality in higher education informs us most about the graduate 
level, about reputational ratings, and about the correlations be- 
tween reputational ratings and quantifiable indicators (Hartnett, 
Clark, and Baird 1978). 

Ranking studies 

When Raymond Hughes (1926) conducted his pioneering
reputational study of graduate programs in 1924, only 65
universities in the U.S. awarded the doctoral degree. His study ranked
38 universities in 20 graduate disciplines according to the
number of top scholars they employed, as listed by panels of scholars
from each field. By 1934 the number of institutions awarding
the doctorate had increased to 106; a second Hughes study
(1934) rated 59 universities in 35 fields as "adequate" or
"distinguished," according to faculty raters' assessments of staff and
facilities for the preparation of doctoral candidates. The two
Hughes studies went well beyond their stated purpose of inform- 
ing undergraduates about graduate programs, and established 
important precedents for quality ratings: a focus on the grad- 
uate rather than the undergraduate level, reliance on the opin- 
ions of academicians themselves rather than of outside ob- 
servers, the assigning of numerical positions (ranks) to in- 
stitutions on the basis of this "informed opinion," and an em- 
phasis on the nation's leading institutions. 

More than two decades passed before any attempt was made 
to update Hughes' work. The first cross-discipline post-war 
study, conducted by Keniston in 1957, ranked 24 graduate pro- 
grams at 25 institutions as part of a comparative self-study at 
the University of Pennsylvania (Keniston 1959). From a list of 
the 25 institutions leading in doctorate production, the raters
(department chairmen selected from the institutional members of
the American Association of Universities) were asked to name
the top five departments in their fields on a combined measure
of doctoral program quality and faculty quality. Keniston then
compared his aggregated, rank-ordered list of the top 20
institutions with Hughes' results to show "what changes have taken
place in the course of a generation" (p. 116).

The major weaknesses of the Hughes and Keniston studies
are summarized in the introduction to the 1966 ACE ratings.
According to Cartter, these earlier rankings reflect geographical
and rater biases, though in Hughes' case geographical bias
may have been inevitable since at that time the most distinguished
universities were concentrated in the Northeast and the Midwest.
Cartter comments on three other flaws in the Keniston
study: failure to separate measures of faculty quality from
measures of educational quality, failure to anticipate that raters
would overrank their alma maters, and the choice of department
chairmen as raters. On this last point, Cartter asserts that
chairmen are not necessarily the most distinguished scholars in
their fields; that they are not typical of their peers in age and
rank, specialization, or knowledge of the academic scene; and
that because they tend to be older and more conservative, and
thus to favor those institutions which have traditionally
produced the largest number of doctorates, their ratings reflect
outdated perceptions.

These criticisms guided the design of the ACE studies; great
care was taken to achieve an equitable geographical distribution
and to assure the representativeness of institutions and raters.
Cartter defends the validity of using peer raters to evaluate
graduate education:

The present study is a survey of informed opinion. The
opinions we have sought are what in a court of law would be called
"the testimony of expert witnesses"--those persons in each field
who are well qualified to judge, who by training are both
knowledgeable and dispassionate, who through professional
activities are competent to assess professional standards, and
who by their scholarly participation within their chosen fields
have earned the respect of their colleagues and peers (Cartter
1966, p. 8).

As with previous reputational studies, the informative value
of subjective ratings of quality is assumed by the Cartter
report, which has three stated purposes. The first is to update the
Hughes and Keniston studies. The second purpose, directed to-
litical science departments and calls the Somit and Tanenhaus
study "a preview of the ACE survey" (Cartter 1966, p. 100).

Cartter also compared peer ratings with so-called objective
measures. For instance, he found that the rankings produced by
subjective responses were consistent with the institutional
rankings found by Bowker (1964), who used the enrollment of
graduate award recipients in institutional programs as a criterion.
Other quantifiable indicators used by Cartter for comparison
with peer ratings are faculty salary, library resources, and
publication indexes; in each case, the results tend to corroborate the
study's findings. Cartter thus concludes:

It seems likely that if one were to include enough factors in
constructing a so-called objective index--allowing for
variations in institutional size and a university's commitments to
certain fields of study--the results of our subjective assessment
would be almost exactly duplicated (Cartter 1966, p. 118).

Cartter was unwilling to aggregate departmental ratings to 
produce institutional rankings for three reasons. First, not all 
institutions offer doctorates in every field. Second, it would be 
very difficult to assign weights to various fields. Third,
departmental specialization is the chief organizing principle in
academe.

The academic community's response to the Cartter report
was overwhelming. The report stimulated widespread comment
and critique, and by 1970 more than 26,000 copies had been
distributed. The 1970 Roose-Andersen study essentially replicates
the 1966 ACE study, fulfilling Cartter's commitment to do a
five-year follow-up study lest reputations become "writ in
stone" (Cartter 1966, p. 8). Roose and Andersen (1970) assert
that the purpose of their report is informational; it is not
intended "to inflate or deflate institutional egos. It is hoped that
readers will think in terms of quality ranges rather than
specific pecking orders" (p. 33).

Thus, the Roose-Andersen report presents ranges of scores
rather than "absolute" raw departmental scores. In addition to
this change, the word "quality" is omitted from the title. The
authors state: 

Since it is evident . . . that the appraisal is of faculty and
programs as reflected by their reputations rather than as they
partake of specific components of an amorphous attribute called
"quality," we have resolved to use as a title simply a
description of the book's contents, A Rating of Graduate Programs. . .
(Roose and Andersen 1970, p. xi).

The follow-up study extended the sample from 29 to 36 fields
and from 106 to 131 institutions. A total of 2,626 graduate
programs are thus surveyed, representing a 58 percent increase
over the number of programs rated in the 1966 report.

Although both ACE reports explicitly and implicitly
discourage the aggregation of departmental data into institutional
"scores," other researchers ("American Education Council . . ."
1971; Magoun 1966; Morgan, Kearney, and Regens 1976;
National Science Foundation 1969; Petrowski, Brown, and Duffy
1973) were quick to make such aggregations, perhaps because
comparing programs and institutions with one another is "an
almost inevitable byproduct of the American competitive spirit"
(Clark 1976, p. 85). Moreover, researchers and government
officials find total institutional scores useful in providing a
developmental view of higher education and in facilitating national
planning (Magoun 1966; National Science Foundation 1969;
Petrowski, Brown, and Duffy 1973).

Table 1 lists alphabetically the top ten universities identified
in each of the major studies discussed. Although the domain of
graduate education has changed dramatically over the last half
century, the same seven institutions appear at the top over the
50-year span of the studies. It would seem that the reputations
of Berkeley, Chicago, Harvard, Michigan, Princeton, Wisconsin,
and Yale are secure. Nonetheless, the very stability of the
prestige of these institutions raises questions. Academe's fondness
for ranking institutions and for focusing only on those at the
very top betrays a lack of curiosity about educational matters,
an indifference to any truth that cannot be reduced to the most
or the best.

The Roose-Andersen report indicates that the American 
Council on Education will not conduct any more studies to assess 
the prestige of graduate departments. That decision may have 
been prompted by the methodological, conceptual, and political 
complexities that surround ranking and reputational studies. 

Commentaries and critiques

Considerable attention has been given to the weaknesses of
reputational ratings, especially their lack of consensus on the
meaning of quality; the definition would seem to vary from
rater to rater, from program to program, and from discipline
to discipline, making it almost impossible to compare programs
and institutions or to develop normative standards. The major
implication is that the higher education system is too complex to
rank on the basis of one or two dimensions.

Table 1: The Top Ten Institutions Identified in Major Reputational Studies
Institutions are listed in alphabetical order; absolute rank given in parentheses.

Hughes (1926) [a]

  University of California at Berkeley (9)*
  University of Chicago (1)*
  Columbia (3)
  Cornell (10)
  Harvard (2)*
  Johns Hopkins (7)
  University of Michigan (8)*
  Princeton (6)*
  University of Wisconsin (4)*
  Yale (5)*

Keniston (1959)

  University of California at Berkeley (2)*
  University of Chicago (6)*
  Columbia (3)
  Cornell (9)
  Harvard (1)*
  University of Illinois (10)
  University of Michigan (5)*
  Princeton (7)*
  University of Wisconsin (8)*
  Yale (4)*

Cartter (1966) [a]

  University of California at Berkeley (2)*
  University of California at Los Angeles (10)
  University of Chicago (6)*
  Columbia (9)
  Harvard (1)*
  University of Michigan (7)*
  Princeton (4)*
  Stanford (5)
  University of Wisconsin (8)*
  Yale (3)*

Roose-Andersen (1970) [b]

  University of California at Berkeley (1)*
  University of Chicago (8)*
  Harvard (2)*
  University of Illinois (10)
  University of Michigan (6)*
  Massachusetts Institute of Technology (9)
  Princeton (7)*
  Stanford (3)
  University of Wisconsin (5)*
  Yale (4)*

Note: An asterisk (*) indicates those institutions common to all four rankings.

[a] Source: Magoun 1966.

[b] Source: Morgan, Kearney, and Regens 1976.
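
The asterisks in Table 1 can be checked mechanically. The following is a minimal sketch, not part of the original studies: it transcribes the four top-ten lists into Python sets (the short institution labels and variable names are our own shorthand, and ranks are omitted) and intersects them to recover the institutions common to all four rankings.

```python
# Top-ten lists transcribed from Table 1. The short labels are illustrative
# shorthand introduced here, not names used by the studies themselves.
top_ten = {
    "Hughes 1926": {
        "Berkeley", "Chicago", "Columbia", "Cornell", "Harvard",
        "Johns Hopkins", "Michigan", "Princeton", "Wisconsin", "Yale",
    },
    "Keniston 1959": {
        "Berkeley", "Chicago", "Columbia", "Cornell", "Harvard",
        "Illinois", "Michigan", "Princeton", "Wisconsin", "Yale",
    },
    "Cartter 1966": {
        "Berkeley", "UCLA", "Chicago", "Columbia", "Harvard",
        "Michigan", "Princeton", "Stanford", "Wisconsin", "Yale",
    },
    "Roose-Andersen 1970": {
        "Berkeley", "Chicago", "Harvard", "Illinois", "Michigan",
        "MIT", "Princeton", "Stanford", "Wisconsin", "Yale",
    },
}

# Institutions that appear in every one of the four lists.
common = set.intersection(*top_ten.values())

# Institutions that appear in at least one list.
ever_listed = set.union(*top_ten.values())

print(sorted(common))
print(len(ever_listed))
```

On these lists the intersection contains the seven starred institutions, while only fourteen distinct institutions appear anywhere in the four top-ten lists, which illustrates how narrow the top of the reputational hierarchy has remained.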

Much of the criticism directed at reputational ratings
concerns rater bias, which may take several forms. First, overall
institutional reputation, coupled with insufficient information
about particular departments, may produce a "halo effect"; that
is, raters who know little about the specific department at an
institution may rate it according to their perceptions of the
prestige of the institution as a whole. Second, an institution's or
a department's reputation may lag behind its current quality
and practice. For instance, Cox and Catt (1977) rated
psychology departments on the basis of faculty's scholarly
contributions to the 13 journals of the American Psychological
Association between 1970 and 1975. Comparing their results with
the 1970 Roose-Andersen rank-ordered list of psychology
departments, they conclude that reputation does indeed lag behind
scholarly productivity in the field of psychology and that a
reputational survey does not adequately reflect current scholarly
accomplishment.

A third form of rater bias is "alumni effect," the tendency of
raters to give high marks to their alma maters; complicating the
situation, the institutions that produce the largest number of
doctorates also produce the largest number of raters. Fourth, an
institution's size or age may be reflected in reputational ratings.
Finally, even though in some fields nonacademic employers and
other "consumer" groups may be more knowledgeable than
academicians about program quality, such people are virtually never
used as raters in reputational surveys.

Looking beyond methodological criticisms to more substantive 
issues, opponents of the reputational ranking approach argue 
that the results of such studies contribute little to a program's 
self-knowledge or its efforts toward improvement. Moreover,
they claim, by focusing attention only at the top level,
reputational studies do a disservice to institutions and programs at
lower levels, and even to the higher education system as a whole
(Blackburn and Lingenfelter 1973; Conference Board of
Associated Research Councils 1978; Dolan 1976; Drew 1976;
Hartnett, Clark, and Baird 1978; Johnson 1978a; Wong 1977).

While strong objections have been expressed by many (see, 
for example, Tyler 1972), the strongest objections to reputational 
studies are expressed by W. Patrick Dolan in The Ranking
Game: The Power of the Academic Elite (1976). Dolan
criticizes the ACE studies because of their inherent tendency to
reinforce the status quo and thus to impede innovation and
improvement. Neither ACE nor Allan Cartter sought to reform
graduate education, let alone higher education, by ranking
graduate departments. Indeed, Dolan reports that ACE got involved
in "the ranking game" to forestall threatened outside activity in
this area: If ACE had not sponsored the 1966 Cartter study, the
National Research Council was prepared to conduct its own
assessment of graduate programs.

Dolan is particularly skeptical about the criteria used in the 
ACE studies. The high correlations between quality of faculty 
and effectiveness of doctoral program, he says, indicate that only
one dimension is actually operating to determine the ratings.
According to Dolan, subjective ratings of prestige necessarily
reflect an elitist and traditionalist view of higher education, a
view which discourages or denies diversity, especially as
embodied in experimental programs and multidimensional
approaches. Thus, large orthodox departments are rewarded for
their rigidity and their devotion to scholarship, while the
teaching function and undergraduate education are generally ignored.
Dolan also believes that, since increasing consumer awareness is
the explicit purpose of the Roose-Andersen study, student input
should have been incorporated in the ratings.

The careful methodology of Cartter's initial study is
implicitly praised by emulators of his approach (see chapters 2 and
3) but strongly condemned by Dolan, who states that the
"movement from an interesting opinion poll to the pretense of precise
rankings ... is the most subtle and misleading transition in the
studies" (p. 8).

Dolan's final criticisms concern the "uses and abuses" of the 
ACE reputational ratings, especially in view of the prestige of 
the sponsoring agency and the systemwide scope of the studies. 
He fears that they may have an immeasurably adverse impact
on individual institutions, administrations, state legislators, and
even students, especially as they are used for faculty evaluation
and resource allocation. Moreover, Dolan argues that, since the
ACE rankings are used by many popular college guides designed
to aid prospective students in selecting graduate schools, the
studies may even have a deleterious effect on consumer
awareness, contrary to their explicit purpose.

By way of contrast, Blackburn and Lingenfelter (1973), in
their literature review of reputational studies, defend the ACE
ratings on the following grounds:

1) Panel bias has been largely eliminated by the careful
selection procedures of the ACE studies; 2) subjectivity cannot
be escaped in evaluation no matter what technique is used;
3) professional peers are competent to evaluate scholarly work,
the central criterion in reputational studies; and 4) although
not a sufficient condition of general excellence, scholarly ability
is necessary for a good doctoral program. (p. 26)

Proponents of quality assessments in higher education may 
not applaud attempts to rank institutions or departments on the
basis of peer ratings, but they do tend to believe that such
attempts are inevitable, especially in a period when limited
resources dictate the need for careful planning.

Recent quality assessments

In 1976, the Conference Board of Associated Research Councils 
held a planning session to lay the groundwork for another
peer-rating survey of graduate education that would closely parallel
the ACE studies (Conference Board of Associated Research
Councils 1978). To answer the recurring criticism that raters
may be inadequately informed about departments other than
their own, the Conference Board proposed to supply information
about each program: number of students enrolled, number of
doctorates produced in the past three years, and names of faculty
members. In addition to peer ratings, data on the career
achievements of program faculty and graduates will be collected.

An interesting pilot study of doctoral program quality
(Clark, Hartnett, and Baird 1976) was recently conducted under
the joint sponsorship of the Council of Graduate Schools (CGS)
and the Educational Testing Service (ETS). A sample of 73
departments in three fields (24 in psychology, 24 in chemistry, and
25 in history) was surveyed for the purpose of exploring ways
to assess quality. Four major findings emerged: First,
dependable and useful information about program characteristics related
to educational quality can be obtained at reasonable cost and
convenience. Second, between 25 and 30 measures are identified
as especially promising. Third, these measures seem to be
generally applicable across diverse fields. Finally, two clusters of
measures emerge: "research-oriented indicators," including
department size, reputation, physical and financial resources,
student academic ability, and faculty publications; and
"educational experience indicators," concerned with the educational
process and the academic climate (which are rarely considered
in quality studies), faculty interpersonal relations, and alumni
ratings of dissertation experiences. The variables within each of
the clusters are closely correlated with each other, but variables
from the research-oriented cluster rarely have significant
correlations with those from the educational-experiences cluster.
Respondents so strongly agreed on the primacy of preparing
researchers and scholars that separate analyses based on different
program goals were not possible.

The CGS-ETS study also examines peer ratings: their
relationship to a broad array of program characteristics, their
stability, and the feasibility of using them to rate subdisciplines
(Hartnett, Clark, and Baird 1978). The peer-rating component
of the study is like that of the ACE study in that "quality of
faculty" and "effectiveness of doctoral program" are rated
separately but have a correlation of .99; the chief difference is that,
in the CGS-ETS study, peer ratings were made by a larger
number of faculty members at a smaller number of institutions. The
authors report that the resultant rankings are very similar to
the 1966 and 1970 ACE lists (with minor variations in the
rankings of psychology departments). They do not, however, name
institutions.

Subdiscipline ratings, as it turns out, present difficulties: in
estimating the extent to which they are subject to departmental 
halo effect, in dealing logistically with the small number of 
people involved, and in avoiding the likelihood that raters will 
have insufficient information on specialties within their own 
fields. Further, correlations between departmental and subdisci- 
pline ratings are generally so high that collecting subdiscipline 
ratings in national surveys is probably not worth the trouble. 
Clearly, however, variations do exist and may be important in 
individual program evaluations. 

The CGS-ETS study used ratings from students and alumni 
(as well as from faculty) to get supplementary information on 
departments, and perhaps the most interesting finding is that 
reputational ratings bear little relation to teaching and educa- 
tional effectiveness, as revealed by the responses of these groups 
of raters. Thus, peer ratings seem to be unaffected by the com-
pletion rates of graduate students, student perceptions of teach- 
ing quality and of the department's concern for students, or the 
perceived degree of departmental effort toward the career de- 
velopment of junior faculty members. The authors conclude: 

Such data are useful in drawing our attention back to what
the ratings are--peers' judgments of the quality of the
departments' faculty based largely on scholarly publications. They say
little or nothing about the quality of instruction, the degree of
civility or humaneness, the degree to which scholarly
excitement is nurtured by student-faculty interactions, and so on. In
brief, the peer-ratings are not ratings of overall doctoral
program quality but, rather, ratings of the faculty employed in
these programs, reflecting primarily their research records. No
claim has ever been made that the ratings are more than this,
but they have often been interpreted as being more by those
who used them. (pp. 1313-4)

As new and multiple indicators of quality are used, and new
respondent groups surveyed, the literature on assessment may
come to reflect more adequately the scope, diversity, and
complexity of the higher education system, even in the graduate
domain.

The master's degree

Academic master's-level programs have been ignored in virtually
all ranking studies of graduate programs. Providing a rare
exception, the Carpenter and Carpenter study (1970) asked the
deans of 44 accredited library science schools to rate the overall
quality of master's programs, as well as doctoral programs,
using Cartter's five-point scale.

When one considers that 311,620 master's degrees were
awarded in 1978 and 315,090 are projected for 1980 (National
Center for Education Statistics 1980), this lack of information
on the quality of the degree seems a decided embarrassment.
Several explanations can be offered, however, for the tendency of 
academic researchers to ignore the master's degree. First, aca- 
demic departments tend to regard receipt of the master's degree 
as a step toward the doctorate rather than as a discrete event. As 
such, the quality of the master's degree is closely linked to the 
quality of the doctoral program that awards it. Second, some
observers (e.g., Dressel and Mayhew 1975; Leys 1956) suggest
that the master's degree often serves to screen students for
advancement to doctoral candidacy; those students deemed unable
to complete the doctorate are awarded the master's degree and
gracefully eased out. Finally, in many fields (e.g., education,
social work), the master's degree constitutes a license to
practice. Any attempt to rank master's programs in these fields
would be confounded by the dual academic and professional
orientation of such programs (see Glazer 1975).

Under the auspices of the Council of Graduate Schools, 
attention is being given to the assessment of quality at the 
master's level through the use of the CGS-ETS multidimensional
instrument discussed earlier in this chapter (Clark, Hartnett,
and Baird 1976). At the 18th Annual Meeting (Council of
Graduate Schools 1978), the responses of 78 CGS member
institutions to a survey questionnaire were analyzed to determine
the usefulness of individual items on the CGS-ETS instrument
in six areas: faculty training and performance, student
ability and experiences, physical and financial resources,
judgments about the learning environment, judgments about
academic offerings and procedures, and accomplishments of recent
graduates. Clark's presentation at the meeting points out that,
though evaluations of master's programs have only recently
begun, three themes have already emerged from discussions of
program reviews:

1. Reviews of graduate programs need to be multidimensional,
going well beyond counting number of degrees granted or
comparing reputational ratings, if they are to reflect the
complexity and variations of graduate education.

2. Graduate programs should be reviewed in relation to their
differing purposes, such as preparing researchers or practicing
professionals, meeting local or national manpower needs, or
preparing students for doctor's or master's degrees.

3. Program reviews should lead to the improvement of program
quality, rather than focusing entirely on external demands
for program accountability. (pp. 213-4)

Emerging interest in the master's degree may be explained
by the increasing consumer orientation of evaluations of higher 
education. After a history of neglect, the master's degree may 
benefit from what has been learned from quality assessment at 
the doctoral level.

Summary and conclusions

Reputational studies— with their focus on faculty prestige as 
perceived by faculty raters, their preoccupation with graduate 
education and research-related characteristics, and their reliance 
on similar criteria and methodologies from one survey to the 
next— have dominated quality assessments of higher education, 
especially since publication of the first ACE report in 1966. Like 
disciples following a religious leader, later researchers seem
unwilling to question or try to improve on the work of Allan
Cartter, even though the same small group of nationally known,
long-established, resource-rich universities keeps appearing at the
apex of the pyramid. 

The unfortunate consequences of this situation are perhaps 
more attributable to the higher education community's
competitiveness, the mass media's lust for sensational headlines, and the
American public's obsession with knowing who's at the top, than 
to any fault of the studies themselves. Despite their repeated 
cautions against aggregating departmental scores to produce
institutional scores and their constant reminders that the ratings
represent the subjective judgments of faculty and that they
probably reflect prestige rather than quality, scores do get
aggregated, institutions do get compared with one another, and
high prestige is translated to mean educational excellence.

As a result, research and scholarly productivity are
emphasized to the exclusion of teaching effectiveness, community
service, and other possible functions; undergraduate education is
denigrated; and the vast number of institutions lower down in 
the pyramid are treated as mediocrities, whatever their actual 
strengths and weaknesses. 

On the other hand, considering the extent to which the U.S.
higher education system has expanded and diversified over the 
past two decades to accommodate the swelling enrollments 
caused by both the post-World War II baby boom and the
growing demand for postsecondary education, the need to identify
and distinguish high-quality programs and institutions is great. 
The threat of retrenchment in response to shrinking enrollments 
and tighter resources makes this need even more urgent.
Policymakers facing difficult decisions must know what constitutes
quality in higher education; in particular, they need to have
better information about those programs and institutions that are
lower down in the prestige pyramid and thus often fail to be 
covered in the results of reputational studies.

Assessments of Professional Program Quality



The ACE reputational studies (Cartter 1966; Roose and Andersen
1970) were criticized by some commentators (e.g., Petrowski,
Brown, and Duffy 1973) for not including professional programs
(though both studies did rate graduate programs in
chemical, civil, electrical, and mechanical engineering). The
omission was, however, deliberate and can be attributed to two
factors. First, professional education is not a specific concern of
the American Council on Education. Second, except for
engineering and applied sciences, professional education was not a
direct beneficiary of the 1958 National Defense Education
Act, which funded the flourishing academic enterprise of the
1960s, thereby providing the impetus for the ACE studies.

Now, however, the climate has changed. Students manifest a
"new vocationalism," evidenced in their choice of majors, their
aspirations for professional degrees, and their pragmatic
attitudes and values (Astin, King, and Richardson 1979).
Enrollments in many academic disciplines decline, while applications
to professional schools soar. Consequently, the perceived need to
rate professional programs and to identify the "top-quality"
schools grows more imperative. Margulies and Blau (1973)
summarize the situation:

As professional jobs become scarcer and employers more
selective in choosing applicants, the differences among
professional schools, in their quality and in their other
characteristics, are of growing consequence. Since holders of master's
and doctoral degrees have proliferated in the labor market,
where he has come from, rather than the degree itself, may
present an increasingly powerful passport to entry into
professions. (p. 21)

As in the graduate domain, the peer-rating, reputational
approach to quality has so far dominated assessments of
professional education. Further, the methodology of the Cartter and
the Roose-Andersen studies is the pervasive model for these
assessments. This chapter discusses the two kinds of ranking
studies, categorized according to their source: those conducted
by the academic community and usually involving assessment of
professional schools in several fields, and those conducted by the
professions themselves and limited to programs within the 
single professional field.

Studies by the academic community



An early ranking study of professional education was done by
Margulies and Blau (1973) and grew out of a larger study of
organizational structures (Blau 1973). Programs in 17
professional fields were ranked on the basis of the number of times
that respondents, the deans of professional schools, named an
institution (including their own) as among the top five in their
field. The Margulies-Blau ratings received widespread attention
and criticism, much of the latter being methodological: The
overall response rate to the survey was only 36 percent, and in
8 of the 17 fields, institutions were ranked on the basis of
responses from fewer than 20 raters.
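
The tallying procedure described above can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: the ballot data, the function name rank_by_mentions,
and the exclude_self flag are hypothetical and are not taken from
the Margulies-Blau report.

```python
from collections import Counter


def rank_by_mentions(ballots, exclude_self=False):
    """Rank institutions by how many raters named them.

    ballots: list of (rater_institution, [institutions named]) pairs.
    exclude_self: hypothetical option to drop a rater's vote for
    his or her own institution.
    """
    counts = Counter()
    for rater_school, named in ballots:
        for school in set(named):  # count each school once per ballot
            if exclude_self and school == rater_school:
                continue
            counts[school] += 1
    # Most mentions first; ties broken alphabetically for stability.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented ballots from three deans (illustrative only).
ballots = [
    ("A", ["A", "B", "C"]),
    ("B", ["A", "B"]),
    ("C", ["A", "C"]),
]

print(rank_by_mentions(ballots))
print(rank_by_mentions(ballots, exclude_self=True))
```

With so few raters, a single ballot can change a school's count
noticeably, which is the methodological concern raised about fields
ranked by fewer than 20 raters.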

A year later, the same two researchers received funding from
the National Science Foundation and the Russell Sage Foundation
to replicate their reputational ranking study, using the
same 17 professional fields plus music, with "the aim of
maximizing the response rate and thereby increasing the reliability
of the rankings" (Blau and Margulies 1974-75, p. 43). In this
second study, self-ratings were excluded, and the response rate
was increased to 79 percent. The list of leading professional
schools, however, remained virtually the same in all fields.

Moreover, Blau and Margulies found that their rankings of
the top five institutions agreed with the rankings found in two
other reputational studies: the first a ranking study of library
science programs in which practicing professionals were used as
raters (Carpenter and Carpenter 1970) and the other a
(then-unpublished) ranking study of medical schools in which faculty
members were used as raters (Cole and Lipton 1977). Blau and
Margulies (1974-75) conclude that "the reputations of
professional schools in different areas and among different groups of
professionals appear to be sufficiently similar to make overall
ratings of their reputations meaningful" (p. 46).

The same study looked in more detail at seven fields to see 
how professional reputation was related to financial resources 
(total institutional budget and professional school budget) and
to "academic climate" (number of books in the library of the
professional school and of the institution). As in studies of the
correlates of the reputational quality of academic departments
(see chapter 1), the size of an institution's library was found to
be generally highly correlated with the reputation of its
professional schools. Findings with respect to financial resources and
size of the professional school library were not so clear-cut. In
professional education, reputations "depend on different
conditions in different types of professions" (p. 46).



In response to charges leveled against their first study that
rankings "engender invidious comparisons and hurt many graduate
schools that may not be at the very top of their field" (Blau
and Margulies 1974-75, p. 42), the authors contend that since,
"after all, professional schools do differ in quality, and these
differences concern people becoming affiliated with them, . . .
providing information about such differences is a public service"
(p. 42). This argument does not really answer the charge: The
reputation of a professional school may indeed be harmed by
omission from a list of top-rated institutions. Moreover, given
the recent proliferation of terminal, professionally oriented
programs and the growing diversity of professional education,
assessments of quality in this area need to be based more firmly
on considerations of possible differences in the goals of different
professional programs. Referring specifically to the 1973
Margulies-Blau study, Dolan (1976) comments further on the
problem:

Once again, the assumption that there is a single continuum
of quality from the top to the bottom underlies the
interpretation. There is no recognition of the fact that quite possibly
professional schools with quite different missions and diverse
goals could and should be possible, so that quality would mean
quite different things in each of the diverse categories. (p. 98)

Blau and Margulies (1974-75) express surprise over their
failure to find strong significant correlations between the
reputations of professional schools and the reputations of institutions
in which they are located. (They do not specify how they
derived measures to test these relationships, saying only that
measures of institutional reputations are based on the
Roose-Andersen ratings.) Outside the top five schools, correlations
range from only .48 to .85. The obvious question here is: Why
should the reputation of a professional school be similar to the
reputation of the parent institution? This question becomes
especially pertinent when one recalls that "scores" for
institutions were derived by aggregating scores for academic
departments. Moreover, to anticipate that the reputation of the
professional school and of the institution will be closely correlated is
to overlook the lesson of the data: That quality should be
assessed by field specialization, both in professional programs and
in academic disciplines.

Disturbed over the poor showing of California's professional
schools in the two studies by Blau and Margulies, the Regents of
the University of California commissioned Allan Cartter to
conduct a ranking study of professional programs. After Cartter's
death in 1976, the results of the study, which rated programs in
law, education, and business, appeared in Change magazine
(Cartter and Solmon 1977) and in the UCLA Educator (Munson
and Nelson 1977). Comparing the Blau-Margulies and the
Cartter studies, Munson and Nelson conclude that differences
between the two sets of rankings can probably be attributed to
differences in sample size and selection and in the survey
instruments: Blau and Margulies used the deans of professional
schools as raters, asking them to name, by recall, the top five
programs in their fields; Cartter, consistent with his 1966 ACE
study, used deans and faculty members as raters, providing them
with a list of institutions to rate on a five-point scale for quality
of faculty and on a four-point scale for attractiveness of
program (as well as requesting that they indicate how familiar they
were with each professional school listed and whether they
expected any significant improvement in program). Perhaps the
most significant contribution of the comparison is that it helps
to clarify the differences between recall-based and
recognition-based ratings: Providing raters with a list of schools
(recognition) increases the number of contenders for the top and reduces
halo effect and alma mater effect (Munson and Nelson 1977),
whereas asking raters to name schools (recall) reduces the
possibility of prejudicing them by suggesting answers on the survey
instrument. These authors further suggest that assessments of
professional programs should be done by a "complete sample of
experts . . . deans, faculty members, and students at the colleges
and universities that supply students to the professional schools
being rated, plus the prospective employers" (p. 42).

Another problem connected with assessment of professional 
education has to do with time lag. In a rating study of programs 
in educational administration, Gregg and Sims (1972) found
"quality of students and graduates" to be the major attribute
associated with quality by 726 department chairmen, senior
faculty, and junior faculty. Yet, as Blau and Margulies (1974-75)
point out, "the fruits of professional training become
apparent only years after graduation, so that the quality of a
school's program today would have to be judged by the work of
its graduates in the 1980s or 1990s" (p. 42).

Though the major product of the Gregg and Sims study
(1972) just mentioned was its list of the top SO of 80
educational administration programs, respondents were asked to
indicate what factors they believed determine the quality of an
education program; most frequently mentioned was "the
provision of relevant educational experiences in the form of
internships and field studies" (p. 82). Reviewing their results, Gregg
and Sims assert that "a relatively common value system
characterizes scholars in the field of educational administration,"
concluding that future reputational studies of this type "should
utilize appropriate samples of different groups rather than entire
populations" (p. 91). Their findings further suggest that the
values underlying professional education differ significantly from 
those underlying graduate education in academic disciplines. By 
focusing clearly on the link between education and work, the 
Gregg-Sims study also underscores the importance of referencing 
quality assessments to the particular field under study, since 
educational goals and objectives vary substantially from one 
profession to another. 

Studies by the professions 

Not all ranking studies of professional education have been con- 
ducted by academics; some have been carried out within the 
professions themselves.

In the field of business, for instance, the staff of MBA maga- 
zine has conducted two reputational ratings: The first, in 1974, 
used the recall method ("The 15 Top-Ranked Business Schools
in the United States," 1974); the second, in 1975, used
recognition ("The Top 15," 1975). Since the list of top institutions was
the same in both studies, the attention that the 1975 report gives 
to slight changes in absolute rankings seems excessive. 

Other examples of quality assessment by the professions 
themselves come from the field of law. In 1976, the staff of 
Juris Doctor magazine surveyed the deans of 167 American law
schools and readers of the magazine, asking them to list by
recall the top law schools in the country. Responses were re- 
ceived from 58 deans and 1,300 readers. "The results of both 
polls show clearly that most readers and deans can agree on a 
group of approximately 20 law schools that today enjoy the high- 
est reputations in the country" ("The Popular Vote: Rankings 
of the Top Schools," 1976, p. 18). At the same time, the report 
is full of caveats about possible sources of bias that echo criti- 
cisms leveled at academic reputational ratings: Respondents may 
not be familiar with law schools other than their alma maters 
and their employers; they may over-rank their own law schools; 
larger institutions produce more graduates and perhaps more 
of the magazine's alumni readers; the measurement criteria
("academic quality" and "value in landing good jobs") are
vague and subject to various interpretations; numerical ranks
are too absolute, in that a school's being ranked (say) fourth
rather than first or second takes on a greater importance than 
is appropriate; and a program's reputation necessarily lags be- 
hind its actual quality. 

Quantifiable indicators have also been used to rate law schools.
For instance, Lewis (1974) looked at the holdings of law school
libraries; although he does not provide a ranking per se, his data
could be used to construct one. Similarly, Kelso (1975) included
number of volumes in the library as one variable in a "resources
index" for ranking law schools; other measures used in the index
were the numbers of students and faculty and the ratio of
students to faculty, of students to library volumes, and of faculty
to library volumes.
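
The component measures named above are simple counts and ratios,
and a short Python sketch makes them concrete. The function name
resource_measures and the sample figures are hypothetical. How
Kelso weighted or combined the components into a single index is
not described here, so the sketch stops at the raw measures and
does not attempt a composite score.

```python
def resource_measures(students, faculty, volumes):
    """Return the counts and ratios listed for a resources index.

    All three inputs are assumed to be positive numbers. No weights
    are applied, because the weighting scheme is not given in the text.
    """
    return {
        "students": students,
        "faculty": faculty,
        "volumes": volumes,
        "students_per_faculty": students / faculty,
        "students_per_volume": students / volumes,
        "faculty_per_volume": faculty / volumes,
    }


# Invented figures for one law school (illustrative only).
measures = resource_measures(students=600, faculty=30, volumes=150_000)
for name, value in measures.items():
    print(f"{name}: {value}")
```

Any ranking built from such measures would still depend on a
judgment about how to weight them against one another.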

Summary and conclusions 

As reputational ratings of quality have focused on the domain 
of professional education, two points have emerged. 

First, just as in the graduate domain, although the absolute 
ranks of institutions vary from one study to another, traditional 
reputational assessments have consistently identified the same 
professional schools at the top. The Blau-Margulies (1974-75) 
and the Cole-Lipton (1977) rankings of medical schools have in 
common eight institutions among those at the top; three
rankings of law schools (Blau and Margulies 1974-75; Cartter and
Solmon 1977; "The Popular Vote: Rankings of the Top Schools,"
1976) share seven "top" institutions; and three rankings of 
business schools (Blau and Margulies 1974-75; Cartter and Sol- 
mon 1977; "The Top 15," 1975) share six. 

Second, reputational studies of professional education re- 
quire different groups of raters, different criteria, and a differ- 
ent time frame than are usually used in reputational studies of 
academic disciplines. Further, differences among the professions 
themselves must be taken into account in designing
methodologies for rating professional schools.

The drop in the numbers of the college-age population, the 
resultant decline in postsecondary enrollments, and the
unfavorable academic job market portend that, despite the "new
vocationalism" of today's student, professional schools (along with
the rest of the higher education community) face difficult
retrenchment decisions in the near future. As supply and demand
come into closer balance, professional education can benefit from 
the hard-learned lessons that emerge from studies of quality in
the academic disciplines.

Quantifiable Indicators of Quality



In their ceaseless quest for objectivity in assessments of quality
in higher education, researchers have explored a variety of 
quantifiable indicators, used singly and in combination. This 
chapter looks first at those quantifiable indicators found to be
correlates of prestige, as indicated by reputational ratings; then
at what is probably the most common quantifiable indicator, the
scholarly productivity of the faculty; and finally, at a number of
other quantifiable indicators that have been examined in
different studies.

Correlates of prestige 

Even before the ACE studies, quantitative indexes had been used 
to rank institutions (Bowker 1964; Eells 1960; Somit and
Tanenhaus 1964; Wanderer 1966); and many such studies done later
(e.g., Krause and Krause 1970; Packer and Murdoch 1974;
Walsh, Feeney, and Resnick 1969) do not involve comparisons
with ACE ratings. 

Anticipating the results of subsequent research based on 
quantifiable indicators, Cartter (1966) wrote in the introduction
to the first ACE study:

No single index (be it size of endowment, number of books in
the library, publication record of the faculty, level of faculty
salaries, or numbers of Nobel laureates on the faculty,
Guggenheim fellows, members of the National Academy of Sciences,
National Merit scholars in the undergraduate college or
Woodrow Wilson fellows in the graduate school) nor any
combination of measures is sufficient to estimate adequately the true
worth of an educational institution. . . .

The factors mentioned above are often referred to as
"objective" measures of quality. On reflection, however, it is
evident that they are for the most part "subjective" measures
once removed. Distinguished fellows, Nobel laureates, and Na- 
tional Academy members are selected by peer groups on the 
basis of subjective assessments, faculty salaries are determined 
by someone's subjective appraisal, and endowments are the re- 
sult of philanthropic judgments. Number of volumes in the li- 
brary, though more readily quantifiable, is a factor of little 
value in measuring institutional resources unless one can make 
a qualitative judgment about the adequacy of the holdings.
(p. 4)

Despite Cartter's comments about the ultimate subjectivity
of all quality measures, the ACE ratings prompted a number of
researchers (Adams and Krislov 1978; Clark, Hartnett, and
Baird 1976; Clemente and Sturgis 1974; Cox and Catt 1977;
Glenn and Villemez 1970; Hurlbert 1976; Johnson 1978a; Knudsen
and Vaughan 1969; Lewis 1968; Siebring 1969) to rank
institutions on the basis of allegedly objective measures and to
compare their results with those of the Cartter (1966) and the
Roose-Andersen (1970) surveys.

The list of quantifiable measures of human and material re- 
sources that correlate with reputational prestige is enormous. 
Generally, reputational peer-rating studies reflect research-re- 
lated variables (Clark, Hartnett, and Baird 1976). Moreover, 
size alone is a significant correlate (Elton and Rogers 1971;
Elton and Rose 1972; Hagstrom 1971), and size is closely
connected with research productivity, which also correlates with
reputational peer ratings (Drew 1976; Guba and Clark 1978; Knudsen
and Vaughan 1969; Wispe 1969). Publication productivity alone 
is a strong correlate of prestige in some fields (Cartter 1966; 
Lewis 1968). The prestige of the doctorate institution is closely 
related to faculty mobility and employment (Crane 1970; Shi- 
chor 1970) and to faculty salary (Adams and Krislov 1978; 
Muffo 1979). 

Though the magnitude of all these correlations varies across
disciplines, it is generally high enough to suggest that further 
studies of the relationships between peer ratings of the top do- 
main of the higher education system and the quantifiable indi- 
cators mentioned above would be a waste of time. 

In a substantial literature review of quality assessment of
doctoral programs, Blackburn and Lingenfelter (1973)
summarize the findings of studies that attempt to relate
reputational ratings to quantifiable indicators. Warning that
correlation does not equal causation, the authors list 15 items that are
correlated with the 1966 Cartter ratings, as identified primarily 
by a National Science Foundation study (1969) : 

1. Magnitude of the doctoral program (number of degrees
awarded).

2. Amount of federal funding for academic research and
development.

3. Non-federal current fund income for educational and
general purposes.

4. Baccalaureate origins of graduate fellowship recipients
(NSF fellowships).

5. Baccalaureate origins of doctorates.

6. Freshman admissions selectivity.

7. Selection of institutions by recipients of graduate
fellowships (NSF fellowships).

8. Postdoctoral students in science and engineering.

9. Doctoral awards per faculty member.

10. Doctoral awards per graduate student.

11. Ratio of doctorate to baccalaureate degrees.

12. Compensation of full professors.

13. The proportion of full professors on a faculty.

14. Higher graduate student/faculty ratios.

15. Departmental size of seven faculty members or more . . .
(this finding is not a strict correlate calculated from
median scores.) (Blackburn and Lingenfelter 1973, p. 11)

In a study involving a sample of 125 mathematics, physics,
chemistry, and biology departments, Hagstrom (1971) found
strong significant correlations between departmental prestige
(as measured by the Cartter ratings of faculty quality) and a
number of quantifiable indicators, including department size
(number of faculty members), research productivity, research
opportunities, faculty background (including prestige of the
doctorate-granting institution), student characteristics
(including number of postdoctoral fellows and undergraduate
selectivity), and faculty awards and offices. Of special interest is the
finding that department size alone accounts for almost one-third
of the variance in Cartter's prestige rankings in the disciplines
under consideration.
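
For readers unfamiliar with the phrase "accounts for the variance,"
the following Python sketch shows the usual calculation for a single
predictor: the proportion of variance explained is the square of the
Pearson correlation. The data and function names are invented for
illustration; Hagstrom's own data and computations are not
reproduced here. Under this reading, a share of about one-third
would correspond to a correlation of roughly .58, though the
coefficient itself is not reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def variance_explained(xs, ys):
    """Share of variance in ys associated with xs (r squared)."""
    return pearson_r(xs, ys) ** 2


# Invented department sizes and prestige scores (illustrative only).
sizes = [1, 2, 3]
prestige = [1, 3, 2]
print(pearson_r(sizes, prestige))
print(variance_explained(sizes, prestige))
```

As the chapter cautions elsewhere, a high share of variance
explained describes association only and says nothing about cause.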

Scholarly productivity as an indicator of quality 

Perhaps the most commonly used quantifiable indicator is schol- 
arly productivity of faculty. Despite its popularity, however, the 
use of this measure is fraught with difficulties. Sixteen years
ago, Somit and Tanenhaus (1964) noted that the relatively poor
publication records of faculty members at lower-ranking
institutions may in part be attributable to their heavier teaching loads,
lack of access to adequate library facilities, and other such
constraints. It does not necessarily follow, however, that these
faculty members are deficient when it comes to training their
students in research and scholarship or that the institutions
themselves are deficient in teaching and public service.

Even more important, publication productivity may be
causally related to prestige and thus unsuitable for use as an
independent criterion against which to validate reputational
rankings. As Lewis (1968) puts it, "Publication in the leading
journals places the name of the institution in the public eye, and
it is from continually seeing the name of the institution that
others grant it high prestige" (p. 181). The causal connection
may also run in the other direction; that is, the work of faculty
members from prestigious institutions may be more readily
accepted for publication. Thus, according to Dolan (1976), "the
journal publication process itself [is] an exact mirror of the
politics of academic prestige trumpeted by the ACE rankings"
(p. 42).

A pervasive issue in connection with measures of scholarly 
productivity is the confusion of quantity of publication with 
quality, either of the actual product or of its publication source.
Frequency tabulations, even those that differentially weight
different kinds of publications (with books counting for more than
journal articles, etc.), leave much to be desired. As Smith and
Fiedler (1971) note, such frequency tabulations make no dis- 
tinction between worthwhile and inferior books or papers or 
between prestigious and inferior journals. Yet those studies that 
attempt to make such a distinction (e.g., Drew and Karpf 1975)
have been criticized for excluding new fields and new publica- 
tions unless they are frequently updated (Brush 1977). Smith 
and Fiedler also note that scholarly productivity is a more ap- 
propriate criterion for some disciplines than for others. Further, 
they suggest that most attempts to judge the quality of an 
individual's published work are either superficial, since "rarely
is the rater fully acquainted with an individual's writing, it is
more unusual for a rater to have read most or all of a scholar's
publications" (Smith and Fiedler 1971, p. 226) or logistically
impractical, since a thorough reading and content analysis would
simply require too much time. Moreover, an individual's
productivity changes over time (Bayer and Dutton 1977); men and
women differ in their motivations and activity patterns and 
hence in their scholarly productivity (Astin and Bayer 1972; 
Bayer 1978). Overall, women are concentrated in teaching- 
oriented institutions: 68 percent of all women faculty work in 
lower-tier institutions, compared with 58 percent of men
(Shulman 1979). Moreover, women tend to have heavier teaching
loads, regardless of the type of institution in which they are
employed (Gappa and Uehling 1979). Furthermore, an academic
department's productivity changes as funding or age affects the 
size of its faculty (Drew 1975). 

In their review of the literature on the measurement of 
scholarly work, Smith and Fiedler (1971) emphasize that no 
criterion measure now available is sufficiently well established 
to stand alone. They say that the measure that is least contam- 
inated by the prestige factor is citation count: the number of 
times that a scholar's work is cited in the literature by other 
scholars. Citations to older research may be given greater weight 
than citations to more recent research: "According to this ration- 
ale, a scholar deserves extra credit if his 16-year old research is 
still worth quoting" (Smith and Fiedler 1971, p. 228). Nonethe- 
less, the authors note that citation measures still have flaws. At 
one extreme, significant research may not be recognized for a 
long time; at the other, the research may be so well known that 
it is no longer cited by name. Nor is it always possible to dis- 
tinguish between original and secondary research.¹ 
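
The idea of giving older citations extra weight can be made concrete with a small sketch. It is an illustration only, not Smith and Fiedler's procedure: the source gives no weighting formula, so the linear weight (1 plus a bonus per year of age), the default bonus value, and the Citation record are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    cited_year: int   # year the cited work was published
    citing_year: int  # year the citing work appeared


def weighted_citation_count(citations, bonus_per_year=0.05):
    """Sum citations, weighting each one by the age of the cited work.

    ASSUMPTION: a linear weight of 1 + bonus_per_year * age. The source
    only says that older citations "may be given greater weight".
    """
    total = 0.0
    for c in citations:
        age = max(0, c.citing_year - c.cited_year)
        total += 1.0 + bonus_per_year * age
    return total


# Hypothetical scholar: one citation to 16-year-old work, one to 1-year-old work.
example = [Citation(1960, 1976), Citation(1975, 1976)]
print(weighted_citation_count(example))
```

With a bonus of zero the function reduces to a plain citation count, which makes it easy to see how much of a scholar's score comes from the age adjustment alone.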

After examining the data from major studies of the corre- 
lates of scholarly output, Smith and Fiedler (1971) summarize 
their findings as follows: 

Quantity of publication is moderately related to individual or 
departmental eminence, productivity and recognition are mod- 
erately related, and citation counts correlate well with recog- 
nition and individual eminence. The relationship between cita- 
tion counts and quantity of publication is less clear . . . [as is] 
the relationship between citation counts and departmental 
prestige. . . . The data suggest that citation counts should be 
compared only within a given field, not between fields. (pp. 
232-3) 
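
One way to honor the advice that citation counts be compared only within a field is to standardize each count against the other scholars in the same field. The sketch below is an assumption-laden illustration rather than anything proposed in the source: the choice of z-scores, the record layout, and the sample figures are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_field(records):
    """Return {scholar: z-score of citation count within the scholar's field}.

    records: iterable of (scholar, field, citation_count) tuples.
    ASSUMPTION: z-scores are used as the within-field yardstick; the
    source recommends within-field comparison but names no statistic.
    """
    records = list(records)
    by_field = defaultdict(list)
    for _, field, count in records:
        by_field[field].append(count)

    scores = {}
    for scholar, field, count in records:
        counts = by_field[field]
        spread = pstdev(counts)
        # A field with no variation gives everyone a score of zero.
        scores[scholar] = 0.0 if spread == 0 else (count - mean(counts)) / spread
    return scores


# Hypothetical data: raw counts differ tenfold between the two fields.
sample = [
    ("A", "physics", 100), ("B", "physics", 200),
    ("C", "history", 10), ("D", "history", 20),
]
print(standardize_within_field(sample))
```

In this made-up sample, scholars B and D receive the same standardized score even though B's raw count is ten times D's, which is the kind of cross-field distortion the quoted passage warns against.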

Other quantifiable indicators of quality 

The Blackburn-Lingenfelter literature review (1973) describes 
and evaluates other quantifiable indicators that have been used 
to assess quality, including measures of faculty achievement and 
other traits, student quality, institutional resources, program 
efficiency, client satisfaction and external viewpoints, and out- 
comes. Their comprehensive overview is discussed below. 

Faculty achievement and other traits— The achievement of fac- 
ulty, such as degrees and awards, offers another measure of de- 
partment quality. In addition, some researchers have looked at 
other faculty traits such as years of teaching experience and 
travel abroad, but these "are not as useful as measures more 
closely related to the actual productive work of the faculty, such 
as scholarly writing or the training of Ph.D.'s" (Blackburn and 
Lingenfelter 1973, p. 8). Crane (1966) points out that measures 
of the performance of individual faculty members, as opposed 
to averaged measures of the performance of all faculty mem- 
bers within a department, have the advantage of making in- 
dividual contributions more explicit. 

¹ See Bayer and Folger (1966) and Margulis (1967) for a more de- 
tailed discussion of the citation index. 

Blackburn and Lingenfelter are less critical of quantifiable 
indicators based on faculty achievement than are some commen- 
tators, failing to note such possible sources of bias as the age of 
the individual faculty member and the size, age, or eminence of 
a particular department or institution. Moreover, the awarding 
of recognition to individuals in academe may be unduly influ- 
enced by cronyism and by the existence of old-boy networks. 

Student quality— Another approach to evaluating program qual- 
ity is to examine the quality of the students enrolled in the pro- 
gram. Recalling that Cartter (1966) used the distribution of 
Woodrow Wilson Fellows among graduate departments as one 
measure in validating his study, Blackburn and Lingenfelter 
caution that not all graduate students compete for such awards 
and that mistakes can occur in the selection process. When Sol- 
mon (1976) ranked 60 institutions according to the total number 
of students with National Institutes of Health (NIH) predoctoral 
and postdoctoral fellowships enrolled in 1969, he found dif- 
ferences in the enrollment patterns of male and female NIH 
fellowship awardees. These results suggest that sex, and per- 
haps racial/ethnic background, should be considered when stu- 
dent achievement is evaluated, as well as when other quantitative 
measures such as scholarly productivity are used to measure 
quality, at least until women and minorities become better rep- 
resented in academe. 

Blackburn and Lingenfelter raise the issue of cultural bias 
in the use of standardized tests to measure the ability of mi- 
nority-group members, "particularly since equalized opportunity 
is a desirable societal goal" (1973, p. 9). And, despite the equivo- 
cal evidence with respect to role models and student success 
(Astin 1968; Goldstein 1979), they suggest that "the absence or 
presence of minority group faculty as mentor-models must be 
considered in the assessment of a program" (p. 9). 

While acknowledging that prestige may contaminate meas- 
ures of student quality in that good students are attracted to 
high-prestige programs, Blackburn and Lingenfelter nonetheless 
assert that "student quality can stand in its own right as a cri- 
terion of excellence" because "well-qualified students are an es- 
sential element of an excellent program" (p. 8). They also point 
out that the lack of uniformity in grading standards is a weak- 
ness when past academic performance is used to measure student 
quality, but they ignore other aspects of the problem of student 
input: The more capable the student upon entry into a graduate 
program, the more likely it is that the student's subsequent 
achievement will be high, whatever the quality of the program. 
Thus, measures of the achievements of program graduates re- 
veal little about the educational effectiveness of a program un- 
less such input factors as ability, aspirations, motivation, and 
past accomplishments are taken into account. 

Institutional resources— Library holdings are among the most 
common measures of institutional resources used in quality 
studies. Like Cartter (1966), Blackburn and Lingenfelter (1973) 
note the insufficiency of looking just at the number of volumes 
in the library and the necessity of considering whether library 
holdings are comprehensive, up-to-date, and easily accessible. 
Further, they suggest that studies of doctorate education should 
evaluate the adequacy of laboratories, office space, computer cap- 
abilities, seminar rooms, and the like. They caution that, since 
required facilities differ by discipline, specialists in each field 
ought to make such evaluations. 

Program efficiency— Program efficiency at the graduate level is 
usually defined in terms of the number of doctorates produced 
per graduate faculty member or the number of doctoral students 
enrolled. Moreover, assessments of program efficiency usually 
involve some kind of cost-benefit analysis. Reviewing the ap- 
proaches used to assess doctorate quality by measuring program 
efficiency, Blackburn and Lingenfelter (1973) state that "the 
ideal index of efficiency in Ph.D. production probably has not 
been devised" (p. 15). They do, however, offer a list of items 
that should be included in such an index: 

1) Enrollment data for students from the time of entry until 
the termination of their study (with or without a degree); 2) 
tabulations of individual and departmental activity relative to 
dissertation committees; 3) tabulation of undergraduate work 
loads; and 4) tabulation of all instructional activities (semi- 
nars, directed readings, etc.) relative to doctoral education. 
(p. 15) 

Client satisfaction and external viewpoints— As Blackburn and 
Lingenfelter note, "clients" may refer to current students, grad- 
uates, or the employers of graduates; all these groups have been 
surveyed as a means of evaluating quality in higher education. 
Harvey (1972) reviews the literature on the use of student opin- 
ion. In addition, some studies (Bess 1971; Hagstrom 1971) have 
looked at faculty morale and satisfaction as indicators of pro- 
gram quality. 

Blackburn and Lingenfelter fail to mention the hostility that 
many academics manifest toward student evaluations, perhaps 
because they fear that alienated students may be unduly harsh in 
their judgments. There is also some feeling in the higher educa- 
tion community that surveys of student opinion may amount to 
little more than popularity contests. Clearly, although student 
evaluation has been incorporated in some surveys (e.g., Clark, 
Hartnett, and Baird 1976), the chief "consumers" of higher 
education have usually not been given an opportunity to make 
their views known. 

Colleges and universities have also been reluctant to give 
outside observers a voice in quality assessment. Hence, in dis- 
cussing the intrainstitutional approach to quality assessment, 
Blackburn and Lingenfelter (1973) maintain that, if external 
evaluators are used, care must be taken "not to vitiate" the 
effectiveness of confidential internal self-assessments in enhanc- 
ing self-evaluation, reducing defensiveness, and providing a 
"powerful impetus for improvement" (p. 18). 

Outcomes— The final group of quantifiable indicators, discussed 
briefly by Blackburn and Lingenfelter, includes outcomes such 
as the scholarly productivity of program graduates (a measure 
that is, of course, subject to the same constraints as measures of 
faculty productivity) and their subsequent employment history, 
which is usually treated by means of cost-benefit analysis. They 
mention two major difficulties in conducting cost-benefit analyses 
in higher education: lack of information on the actual costs and 
"exceedingly complex conceptual problems in establishing a valid 
measure of the social benefits of graduate education" (p. 15). 
Others have suggested the importance of considering supply and 
demand in the job market (Shichor 1970) as well as the cumula- 
tive effects of time on career patterns, achievement, income, and 
so forth over the life span (see Mincer 1970). Psacharopoulos 
(1975) reviews the literature concerning the relationship be- 
tween institutional quality and subsequent income, asserting that 
this relationship remains obscure for several reasons. First, in 
most cases, the size of the samples studied has been "relatively 
small or too specific for particular groups of people or educa- 
tional levels" (p. 88). Second, statements about the effect of 
institutional quality on earnings cannot always be established at 
a statistically significant level. Third, the question of whether to 
use independent measures of institutional quality and student 
ability or whether to "simply use an average measure of student 
ability as a proxy for institutional quality" (p. 89) has not been 
resolved. These reasons would seem to parallel the challenges 
that continue to confront those concerned with measuring quality 
in higher education: to ascertain appropriate criteria and to 
quantify criteria so as to permit comparisons (Blackburn and 
Lingenfelter 1973). 

Summary and conclusions 

The literature on the assessment of quality in higher education 
reveals consensus on a number of needs: 

□ Quality assessment must extend beyond the leading 20-to- 
80 institutions. 

□ Multiple indicators should be used. 

□ The opinions of consumers (current students, graduates, 
and the employers of graduates) should be incorporated in 
program ratings. 

□ Some attempt should be made to quantify the social and 
individual benefits of higher education. 

□ More attention should be paid to student learning and 
growth as the desired outcomes of higher education. 

□ Quantifiable indicators must assess adequacy as well as 
frequency or volume (e.g., library holdings, publications). 

□ Different quantifiable indicators are relevant to different 
disciplines. Moreover, differences between the sexes and 
among different racial/ethnic groups should be taken into 
consideration. 

Further study is needed to find assessment procedures ap- 
propriate to different program purposes and different educational 
levels. Further research is also needed on transitions from one 
educational level to another. Urgent questions about which 
measures of quality are relevant for program improvement and 
which for policy decisions still have to be answered. Since norma- 
tive data are necessary to compare programs with one another, 
it seems desirable to establish ongoing procedures for collecting 
quantifiable information on physical facilities, faculty quality, 
and so forth. Most important perhaps, institutions must become 
more concerned about quality in terms of the development of 
their students and thus must extend evaluation to consider out- 
comes other than income, publications, and other easily quantifi- 
able but somewhat superficial considerations. How much weight 
colleges and universities should assign to the results of external, 
nationwide research—as compared with their own internal as- 
sessments—remains an issue. 



Quality Assessment at the Undergraduate Level 



That academe has made considerably fewer attempts to assess 
quality at the undergraduate level than at the graduate level is 
not surprising: The scope, diversity, and multiplicity of func- 
tions that characterize the undergraduate domain make mean- 
ingful comparisons among institutions and programs difficult. 
On the other hand, since undergraduate enrollments far exceed 
graduate enrollments, valid multidimensional measures of under- 
graduate quality could benefit both potential students deciding 
on which college to attend and prospective employers choosing 
among the graduates of various undergraduate programs. 

This chapter first reviews some of the more traditional 
studies of undergraduate quality, discusses the Gourman ratings 
(probably the most well-known but at the same time highly 
questionable ratings of undergraduate education), and then de- 
scribes an example of the popular college guides offered by com- 
mercial publishing houses. The final section discusses the input- 
environment-outcome model for assessing the quality of under- 
graduate education, an approach that seems especially promis- 
ing. 

Traditional academic studies 

A number of traditional studies rating undergraduate education 
have demonstrated that colleges differ greatly in their resources 
(with that term encompassing a multiplicity of factors, human 
as well as financial). Thus, in a study ranking 119 undergraduate 
institutions on the basis of multiple weighted quantitative indi- 
cators and then comparing each institution's "quality index" 
score with its library resources, Jordan (1963) found that high- 
scoring institutions have more library volumes per student and 
spend more on salaries for library staff than do low-scoring in- 
stitutions. Moreover, without identifying specific undergraduate 
schools, Brown (1967) grouped colleges on the basis of eight 
factors: (1) proportion of faculty with the doctorate; (2) aver- 
age compensation (salary and fringe benefits) per faculty mem- 
ber; (3) proportion of students continuing to graduate school; 
(4) proportion of graduate students; (5) number of volumes in 
library per full-time student; (6) total number of full-time 
faculty; (7) faculty-student ratio; and (8) total current income 
per student. These factors are similar to those used to evaluate 
graduate and professional programs and thus fail to take into 
account the special nature of the undergraduate experience. 

Astin (1977a) confirms that wealth and resources are un- 
equally distributed among undergraduate institutions, especially 
in terms of their enrollment of highly able students. He empha- 
sizes that, in light of the differing admissions policies of dif- 
ferent types of institutions, equal opportunity and equal access 
—though "among the most popular cliches in the contemporary 
jargon of postsecondary education" (p. 8)—may be more myth 
than reality. 

Other traditional studies rate undergraduate institutions on 
the basis of student achievements. For instance, Krause and 
Krause (1970) rank colleges according to the number of their 
baccalaureate graduates who contributed articles to Scientific 
American between 1962 and 1967. Although the authors credit 
the "potency of small colleges in producing scientists" (p. 134), 
when one looks at the large schools mentioned in their study, 
one is forced to conclude that Scientific American may be a less 
scholarly publication than others in which the baccalaureate 
graduates of larger institutions might publish. Further, gradu- 
ate study is more appropriately associated with publication than 
is undergraduate study. Had the results been adjusted to con- 
sider graduate school origins rather than baccalaureate origins, 
the list of larger institutions might well have changed. 

Dube (1974) ranks 100 undergraduate institutions accord- 
ing to the total number of their alumni who entered medical 
schools in 1973-74. The purposes of the study are not made clear, 
and the result is a unidimensional, purely statistical portrait; 
although the absolute ranks might fluctuate from year to year, 
the same group of institutions would probably emerge at the 
top in any subsequent rankings. Similarly, Tidball and Kistia- 
kowsky (1976) rank institutions according to the proportions of 
their baccalaureate graduates who go on to earn doctorates. The 
same criticism can be applied to both studies: The criterion 
used is irrelevant for many colleges that do not emphasize pre- 
paration for graduate or professional school as the fundamental 
purpose of undergraduate education. Moreover, the extent of 
self-selection among "achievers" may be such that their subse- 
quent success can be attributed more to their own abilities and 
aspirations than to the impact of the college experience (see 
Anderson 1977; Astin 1963). 

In a series of studies, Astin (1965a, 1971; Astin and Hen- 
son 1977) has developed a systematic, replicable measure of one 
aspect of quality in undergraduate education—the selectivity 
index, an estimate of the average academic ability of an institu- 
tion's entering freshmen. The most recent update of the selec- 
tivity index (Astin and Henson 1977) uses the SAT and ACT 
scores of 1973 entering freshmen to estimate the selectivity of 
all accredited two- and four-year colleges and universities. The 
authors state that 

educators have a keen interest in selectivity because the folk- 
lore of higher education suggests that the more selective in- 
stitution has higher academic standards than the less selective 
institution and, by implication, a higher quality education pro- 
gram. Both faculty and administrators are inclined to view the 
average test scores of their entering freshmen as an index of 
institutional worth. Regardless of the validity of such views, 
ample evidence suggests that an institution's selectivity is a 
good measure of its perceived quality. (pp. 1-2) 

In short, the more able the student body, the more likely it is 
that the institution will be perceived as of high quality. 

The validity of the latest selectivity index is supported by its 
correlations with selected institutional characteristics such as 
tuition and student-faculty ratio. This index has been used not 
only to rank undergraduate institutions (Astin and Solmon 
1979; "Most Selective Institutions of Higher Education," 1978) 
but also as an independent variable in a longitudinal study of 
college impact, discussed in greater detail below (Astin 1977b). 

Another way of looking at undergraduate quality is exem- 
plified by studies that examine the college preferences of highly 
able students (Astin 1965a; Astin and Solmon 1979; Nichols 
1966) or of students from specific regions and in specific major 
fields (Astin and Solmon 1979). Astin and Solmon note that 
"while a popularity measure does not necessarily reflect the 
average level of academic talent in the student body (i.e., selec- 
tivity), it does provide a measure of the institution's drawing 
power among very bright students" (p. 49). Of course, to assert 
that the college preferences of highly able students are an indi- 
cation of the perceived quality of an institution is not the same 
as asserting that a highly selective institution does a good job 
of educating its students. Institutional popularity does, however, 
reveal how much choice exists for—and from—an applicant pool. 
The authors conjecture that the relative stability over time in the 
college preferences of highly able students is attributable to the 
existence of a kind of folklore about higher education quality 
and that "measures of selectivity and popularity ... are simply 
a reflection of the students' ultimate acceptance of this folklore" 
(Astin and Solmon 1979, p. 60). 

Finally, using proportionate numbers of raters from 20 per- 
cent of all U.S. colleges and universities, Change magazine 
(Johnson 1978a) conducted a reputational study, based on rater 
recall, to investigate three senses of the term "leadership." The 
study found that raters from all types of institutions agree on a 
list of those institutions "leading" in national influence, and, 
in fact, confirm "the traditional cluster" of top-rated institutions 
identified in previous graduate and professional rankings. They 
found, however, that when asked to list those institutions that 
were leaders in the sense of being innovators, "respondents from 
all types of four-year institutions cited liberal arts colleges more 
than other types of institutions"—especially those liberal arts 
colleges with highly selective admissions policies and those that 
are the leading producers of baccalaureate graduates who go on 
to get the doctorate—whereas two-year college respondents 
"mostly cite community colleges" (p. 51). 

The Johnson study also examined the results with respect to 
geographic proximity. According to this analysis, even though 
institutions of all types are unlikely to cite institutions within 
their own state when asked to name national leaders or innova- 
tors, they are likely to mention such institutions as having a 
major influence on their own programs, especially those institu- 
tions belonging to the same Carnegie category. Further, com- 
munity colleges are most likely to manifest this regionalism. 
The Change study calls into serious question, then, the utility of 
the simple rank orderings reported in past reputation surveys. 
It suggests that broader issues of education involving undergrad- 
uate as well as graduate programs ought to be carefully con- 
sidered whenever institutions are ranked. Johnson (1978a) 
states, "the structure of American higher education is far too 
complex to be understood in relation to any single academic 
procession" (p. 51). 

Indeed, another means of rating, and essentially ranking, 
almost the entire higher education system is the Carnegie classi- 
fication (Carnegie Council on Policy Studies in Higher Educa- 
tion, 1976) used by Johnson (1978a) and others. Despite the 
alleged objectivity with which institutions are listed in six broad 
categories, it is clear that the most prestigious are included in 
the subcategory "Research Universities I" for graduate study 
and "Liberal Arts I" for undergraduate study. 

In summary, traditional academic studies, whether reputa- 
tional or based on quantifiable indicators, tend to stand as dis- 
crete entities. Systematic investigation of how to measure qual- 
ity and of what quality means in the heterogeneous undergrad- 
uate domain has so far been lacking. 

The Gourman rating 



Though probably the best known of undergraduate ratings, the 
Gourman ratings (1967, 1977b) are idiosyncratic and unreplic- 
able. Neither report fully explains the methodology used to de- 
rive the ratings. What is revealed, however, may help to account 
for some of the odd results. 

In the 1967 ratings, 1,187 four-year colleges were scored on 
two sets of variables: strength of the institution's academic 
departments and quality of nondepartmental areas. The scores 
were expressed as letter grades corresponding to the College 
Board scale: A=800, B=600, C=400, D=200. Then, variable 
scores in each set were averaged to produce a numerical "aver- 
age academic departmental rating," an "average nondepart- 
mental rating," and an overall "Gourman rating" for each insti- 
tution. 
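
The arithmetic described above can be sketched in a few lines. The letter-to-number scale is taken from the text; everything else is an assumption for illustration. In particular, the reports do not explain how the overall rating was derived, so the simple mean of the two set averages used here is a placeholder, and the function names and sample grades are hypothetical.

```python
# Letter grades mapped to the College Board-style scale given in the text.
GRADE_POINTS = {"A": 800, "B": 600, "C": 400, "D": 200}


def average_rating(grades):
    """Average the numeric equivalents of a list of letter grades."""
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


def gourman_style_rating(departmental, nondepartmental):
    """Return (departmental average, nondepartmental average, overall).

    ASSUMPTION: the overall figure is the mean of the two averages.
    The source does not say how the overall rating was combined.
    """
    dept = average_rating(departmental)
    nondept = average_rating(nondepartmental)
    overall = (dept + nondept) / 2
    return dept, nondept, overall


# Hypothetical college: two departmental grades, two nondepartmental grades.
print(gourman_style_rating(["A", "B"], ["C", "B"]))
```

Because every variable within a set carries the same weight in this scheme, a grade for an alumni association moves the nondepartmental average exactly as much as a grade for the library would.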

Although the Gourman index has been used as a basis for 
other studies (e.g., Solmon 1975), many of Gourman's asser- 
tions are highly questionable. Thus, he rates "older" college 
faculties more highly than "younger" ones on the grounds that 
"a minimum of ten years after college graduation is necessary 
to produce an excellent teacher in the classroom" (p. xiii) but 
offers no evidence to substantiate this claim. Moreover, equal 
weight is given to ratings of a college's alumni association, 
faculty effectiveness, public relations, library, and athletic-aca- 
demic balance, even though common sense suggests that these 
factors differ considerably in the magnitude of their contribu- 
tions to institutional quality. Finally, Gourman reveals a bias 
toward large institutions, tending to rate large public institu- 
tions more highly than smaller liberal arts colleges (Webster 
1979). 

The 1977 Gourman ratings use a format identical to that of 
the 1970 Roose-Andersen study of graduate programs to rank 
only 68 undergraduate programs, as well as pre-medical and pre- 
law programs in the U.S., and foreign/international universities 
and professional schools. Again, no information is given as to 
how ranks and scores were derived, what factors were consid- 
ered, or how these factors were weighted. Supposedly, these 
methodological matters are dealt with in "supplemental reports" 
on institutions; however, no such reports have ever appeared. 

An example of a popular college guide 

In addition to academic studies of undergraduate quality, a num- 
ber of guides to undergraduate colleges are available from com- 
mercial publishing houses, Hawes Comprehensive Guide to Col- 
leges (1978) being a good example. "Based on research data— 
not opinion" (p. xi), this publication rates almost every two- 
and four-year college in the country. 

Perhaps the most interesting aspect of Hawes Guide lies in 
its implicit view of the purposes of undergraduate study and the 
missions of undergraduate institutions, a view clearly at odds 
with that of most educators and which raises a compelling ques- 
tion: Does Hawes Guide—and others of its ilk—reflect what 
prospective students and their parents, as well as others outside 
the academic community, want from higher education? Con- 
sider, for example, the following criteria used in Hawes Guide 
as measures of undergraduate quality:² 

1. "Social prestige" ratings, based on the number of an in- 
stitution's graduates listed in the current edition of the Social 
Register. This information is given so that the prospective stu- 
dent may know "the extent to which the sons and daughters of 
America's upper class—its richest, oldest, most socially promi- 
nent families—go to that college" (p. xi). 

2. "Social achievement" ratings, based on the number of an 
institution's graduates listed in the current edition of Who's 
Who in America. Supposedly this information indicates "how 
likely this college is to help a student achieve high status later 
in life largely through his or her own abilities and efforts" 
(p. xii). 

3. Consumer ratings: Some institutions are labeled "best 
buy," "better buy," and "good buy." 

4. "Faculty salaries" ratings, said to be "one very basic indi- 
cator of the college's academic quality" in that "a college with 
higher faculty salaries will in general attract more highly quali- 
fied professors" (p. xii). (Faculty salaries, however, are not ad- 
justed for geographic or other cost-of-living differences.) 

5. "Expense" ratings that indicate "the level of dorm-stu- 
dent expenses" (p. xii). 

6. "Admissions" ratings: "hard," "selective," and "easy." 

The fallacy inherent in using mention in the Social Register 
is readily apparent; many people are listed in this publication by 
virtue of their parents' or their spouse's status. Moreover, such 
a criterion seems inappropriate for higher education in a demo- 
cratic society, particularly in a period when concern over equal 
access and affirmative action runs high. Similarly, mention in 
Who's Who is a questionable criterion, even though some re- 
searchers (e.g., Tidball 1973) also have asserted that such men- 
tion is related to the quality of one's undergraduate institution. 
The primary difficulty is that this criterion confounds the im- 
pact of the undergraduate institution with the abilities and 
efforts of the individual, who might very well achieve such men- 
tion whatever his or her undergraduate origins. Nonetheless, it 
is interesting to note that those institutions that rank at the top 
in "social prestige" and "social achievement" tend to be the 
same institutions that rank at the top in reputational studies of 
graduate education. The most reasonable explanation for this 
correspondence is that those institutions that are most highly 
visible and prestigious are the same institutions that tend to 
attract affluent and highly able students. 

² Items 3, 5, and 6 in this list are not used in calculating the rank of 
institutions. 

The input-environment-outcome model 

As has been suggested, the principal drawback to assessing an 
undergraduate institution's quality on the basis of such factors 
as selectivity or alumni achievements is that these factors tell 
us nothing about the contribution of the institution itself. That 
a highly selective institution tends to produce high-achieving 
graduates is not necessarily to the credit of the institution or its 
programs; these individuals might well have gone on to be high 
achievers whatever their undergraduate origins. Similarly, such 
institutional resources as highly credentialed and highly pro- 
ductive faculty, a comprehensive and up-to-date library whose 
materials are easily accessible to students, and superior labora- 
tory, computer, and classroom facilities should be regarded as 
indicators of quality only insofar as they can be proved to have 
desirable effects on the development of undergraduates. Most 
educators would surely agree that the chief purpose of under- 
graduate education is to bring about or to facilitate some kind 
of positive growth in students. Thus, assessing the degree to 
which different institutions contribute to such growth provides 
a sound basis for comparing the quality of different under- 
graduate institutions. For this reason, the input-environment- 
outcome model represents the most promising approach to such 
quality assessment. 

In this model (see Astin and Panos 1969), input is defined 
as what students bring with them to college: their prior knowl- 
edge, abilities, aspirations, and motivation, as well as such back- 
ground characteristics as sex, race/ethnicity, and socioeconomic 
status. The environment comprises not only an institution's edu- 
cational programs but also its other resources (including extra- 
curricular activities) and characteristics to which students are 
exposed. The outcome component of the model can be described 
according to three dimensions, all of them involving changes in 
students (Astin 1974; 1977b): 

1. Type of outcome: cognitive or intellective changes (e.g., 
in reasoning ability) versus noncognitive or affective changes 
(e.g., in values and attitudes). 

2. Type of data (that is, the type of information used to as- 
sess cognitive and noncognitive outcomes): psychological data, 
which relate to "the internal states or 'traits' of the individual" 
(Astin 1977b) versus behavioral/sociological data, which relate 
to observable behavior. 

3. Time dimensions: long-term versus short-term effects. 

The relevance of any measure of input or of environment de- 
pends upon what outcomes are being evaluated (Astin 1974). 

Moreover, the findings from a study of college students sev- 
eral years after their graduation (Solmon and Ochsner 1978) 
underscore the importance of distinguishing between the short- 
term and the long-term effects of college. Whereas Astin (1977b) 
found that student values tend to decline over the college years 
(in that smaller proportions of seniors than of freshmen rate 
as essential or very important a number of life goals), and that 
these declines are most marked with respect to status needs and 
business interests, Solmon and Ochsner reported that, several 
years after graduation, interest in certain life goals (e.g., being
very well-off financially) had once again increased, suggesting
that "the effects of college on values and life goals do not endure
long after graduation" (1978, p. 2).

The particular utility of the input-environment-outcome model
is that it permits the researcher to apply statistical control
for student input variables and thus to assess the actual
contributions of environmental variables (i.e., the college ex- 
perience) to the outcomes under consideration. Thus, the im- 
pact of different colleges and different college characteristics on 
student development can be isolated. 
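
The logic of statistical control can be illustrated with a minimal
sketch in Python. It is not drawn from the studies cited above: the
data, the college labels, and the function names (fit_line,
value_added) are hypothetical, and a single input score stands in for
the many input variables an actual study would use. The sketch fits
one regression line of outcome on input across all students, then
averages each college's residuals (actual minus predicted outcome) as
a rough indication of "value added."

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y on a single predictor x.

    Returns (intercept, slope). Assumes the x values are not all equal.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def value_added(records):
    """Mean residual per college after controlling for the input score.

    records: iterable of (college, input_score, outcome_score) tuples.
    """
    records = list(records)
    intercept, slope = fit_line(
        [r[1] for r in records], [r[2] for r in records]
    )
    residuals = defaultdict(list)
    for college, x, y in records:
        predicted = intercept + slope * x  # outcome expected from input alone
        residuals[college].append(y - predicted)
    return {college: mean(vals) for college, vals in residuals.items()}


# Hypothetical data: College A is selective, College B is not.
students = [
    ("A", 70, 75), ("A", 80, 85), ("A", 90, 95),
    ("B", 40, 55), ("B", 50, 65), ("B", 60, 75),
]

va = value_added(students)
for college in sorted(va):
    print(college, round(va[college], 2))
```

In this invented example College A has the higher raw outcomes, yet
its students score slightly below what their inputs predict, while
College B's students score slightly above. A comparison based on
outcomes alone would therefore rank the two colleges differently from
one based on residuals, which is the distinction the model is meant
to capture.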

There are several reasons why this model has not been used
more widely in studies of academic quality. The first is a lack of 
consensus within the academic community on the proper goals 
and objectives (i.e., desired outcomes) of higher education. 
Second, even when goals and objectives are agreed upon, they are 
often stated in vague or abstract terms (e.g., "to make the student
a well-rounded person," "to improve critical-thinking ability")
that are difficult to operationalize. Third, the model
requires a more sophisticated and elaborate methodology than is
involved in (for instance) counting faculty publications or number
of volumes in the library. Most important, the design should
be longitudinal; that is, students must be surveyed at the time of
college entry to determine their input characteristics and then
followed up at some point after exposure to the college environment
(e.g., two years, four years, ten years after college entry)
to assess change.

Nonetheless, the model has occasionally been promoted as a 
means to assess the graduate domain (Blackburn and Lingenfelter
1973; Clark, Hartnett, and Baird 1976; Conference Board
of Associated Research Councils 1978). More frequently, it has
been applied in studies of undergraduate education (Astin
1965b, 1970, 1974, 1977b; Astin and Panos 1969, 1971) as a
means of assessing "value added" by college attendance.

Those who argue for the superiority of such a model do not 
always agree on the assignment of variables among the three 
components. For example, Blackburn and Lingenfelter (1973) 
regard "faculty characteristics" as an input variable, whereas 
Astin (1977b) and the Conference Board of Associated Re- 
search Councils (1978) consider this variable to belong to the 
environment component. Perhaps these differences are attributable
in part to inherent differences between graduate and undergraduate
education: The goals and objectives of graduate education
tend to be relatively clear-cut and more widely agreed to,
whereas the goals of undergraduate education are more nu- 
merous and more diverse and thus require that careful theoreti- 
cal rationales be constructed prior to evaluation. Indeed, to 
circumvent the problems involved in specifying goal accomplish- 
ments in higher education, Cameron (1978) proposes a model 
for measuring the concept of "organizational effectiveness"
using nine criteria; such an approach investigates the environment
of the system rather than its outcomes.

Kerr (1978) reminds us that, in college, "what happens along 
the way is often more important than the purpose of the journey"
(p. 167). The impact that an undergraduate institution has
on the development of its students should surely be regarded as a 
fundamental measure of its quality. As more research is con- 
ducted on college impact using the input-environment-outcome 
model and focusing on student growth or "value added" as a 
major consensual goal (or, of necessity, specifying other goals 
and objectives so that the extent to which they are achieved may
be assessed), the meaning of quality in undergraduate education
will assume more appropriate scope and diversity than is possible
from traditional approaches borrowed from studies of the
graduate domain. 

Summary and conclusions

The academic community has conducted relatively few comparative
assessments of undergraduate programs. Moreover, because
different criteria are used from one study to the next, the assessments
that have been done have produced rankings that are
not comparable. Unlike reputational rating studies in the graduate
and professional domains, ranking studies at the undergraduate
level do not produce identical lists. Perhaps because of
its diversity, the undergraduate level inevitably assumes varied
hierarchies according to the criteria used to rate it. In the absence
of consensus on the goals and objectives of undergraduate
education, studies that focus on student change or "value added"
and that apply the input-environment-outcome model (which 
also provides a useful framework for assigning quantifiable indi- 
cators to different components) may be most valuable. 

If comparisons among undergraduate institutions must be
made, different types of institutions (e.g., two-year and four-year
colleges) should probably be considered separately, and
cognizance should be taken of the uneven distribution of higher
education institutions within and among states. Moreover, although
the point is rarely discussed, institutional assessments
may not adequately reflect the existence of especially strong (or
weak) departments. On the other hand, "departmental quality"
may represent too narrow a criterion for assessing undergraduate
education, where students are exposed to a broader range
of disciplines than is true of graduate students and where other
environmental characteristics may play a critical role in enhancing
or detracting from the undergraduate experience.

Undergraduate education presents a challenge beyond that 
of quantifying or standardizing criteria so as to permit comparisons
among programs and institutions: That challenge is to
find criteria appropriate to the size, heterogeneity, and multi- 
plicity of functions that the undergraduate experience encom- 
passes. 



Other Dimensions and Concerns in Quality Assessment 



Assessments of quality in American higher education can be described
as having either an internal or an external focus. Examples
of the former include the ACE ratings, while the latter
type of quality assessment comprises the accreditation process
and the state program review process.

Thus far, the discussion has centered on the issue of quality 
in higher education as assessed, described, and critiqued in the 
research (i.e., academic) literature. These internal assessments
(e.g., the ACE surveys, the Blau ratings, single-discipline reviews,
and the correlational studies generated by all the preceding)
constitute a literature intended primarily for an academic
audience. To be sure, these studies are public documents,
and some are reported in the mass media. Parents and prospective
students, both undergraduate and graduate, may look
through these materials in their efforts to find the "best" information
on which to base enrollment decisions. Nonetheless,
these documents are of primary interest to academics and relate
most to the "private life" of higher education (see Trow
1976). The tendency to view these ratings as absolute or ultimate
assessments of program and institutional quality, against
the warnings of both the researchers and the critics, is likely to
increase when such reports are used by the general public.

What, then, of external types of quality assessment? What is 
the nature of such activity? How does it differ from internal as- 
sessments? While the interest in, and the furor created by, the 
ACE and similar ratings during the heyday of postwar academic 
expansion seem to have subsided as higher education enters the
"no-growth" era of the 1980s and 1990s, interest in accreditation
(the oldest form of quality assessment) and state program
review (the newest form) is growing, spurred by two major
trends: increasing governmental concern about the financial accountability
of higher education; and increasing public concern
about the outcomes or benefits of college attendance. Displacements
in the job market for college graduates, societal commitment
to the goal of equal educational opportunity, institutional
dependence on direct and indirect federal support (research
grants and student aid payments), and emphasis on consumer
protection have all contributed to the current interest in ex- 
ternal assessments of quality in higher education. 
Accreditation

What is it about accreditation that assumes (or assures) institutional
quality and inspires the faith of college-bound students,
their families, and government agencies? Even though accreditation
standards are not widely understood by the general public,
students and their parents look to accreditation as an indicator
of institutional quality and stability, and institutions respond to
these concerns by listing their affiliations with various accrediting
bodies in their promotional literature. Accreditation is, in
most instances, a prerequisite for participation in federal aid 
programs, both for institutions and for students (that is, stu- 
dents must be enrolled in accredited institutions to receive fed- 
eral financial aid). Yet how strong is the relationship between 
accreditation and quality? And what are the attributes of in- 
stitutional quality as defined in the literature on accreditation? 

Accreditation and quality: The relation between quality and
accreditation is made explicit in the statements of definition
and purpose offered by experts in, and representatives of, the
field. Some examples of their views follow:

□ Kenneth Young (1976a), president of the Council on Post- 
secondary Accreditation (COPA), the national nongovern- 
mental coordinating organization for accrediting agencies, 
says that "if accreditation can be defined in 25 words or less
that definition would be: 'Accreditation is a process that attempts
to evaluate and encourage institutional quality'" (p.
133).

□ According to Harcleroad and Dickey (1975), accrediting 
serves as "the major factor in quality control for our in- 
stitutions of higher education and for various professional 
and specialized programs" (p. 7). 

□ Patricia Thrash (1979), of the North Central Association, 
states that accreditation "provides an assurance of . . . educational
quality and integrity . . . to the educational community,
the general public, and other agencies and organizations"
(p. 116).

□ The Advisory Committee on Accreditation and Institutional
Eligibility (U.S. Department of Health, Education,
and Welfare 1977), a federal advisory panel, asserts that the
federal government uses accreditation as an eligibility criterion
for participation in federal programs because accreditation
provides "a reliable authority concerning the quality
of training offered by institutions and programs" (p. iii; see
also Trivett 1976, pp. 8-19).



Promoted as an attribute of institutional quality, accreditation,
because it is essentially a binary process, may actually
impede true assessments of institutional quality. Accreditation
provides for an assessment of institutional performance against 
institutional objectives or against other (baseline) standards; 
and, operationally, an institution or program either is, or is not, 
accredited. In contrast, quality (like wealth, beauty, and wisdom)
exists on a continuum.

While the accrediting community has been active in asserting 
the relation between quality and accreditation, it has been less 
precise in defining the actual attributes that make for institu- 
tional and program quality, probably because of the cherished 
diversity of the American higher education system (which does 
not lend itself to uniform operational definitions) as well as the 
consensual nature of the attributes of quality: We all (think 
we) know what quality is when we see it, but we have difficulty
describing it for others. 

Accreditation's historical movement from quantitative to 
qualitative evaluation suggests that the accreditation process is 
primarily a criterion-referenced assessment.1 The regional associations'
self-study guides and accreditation documents describe
the accrediting process as the assessment of an institution
in terms of its stated purposes and objectives. Yet some accrediting
agencies currently do provide quantitative guidelines, and
many are indeed interested in quantifiable data that help to describe
institutional attributes and resources (Petersen 1978).
The ambiguity of some of the criteria would appear to give accrediting
agencies flexibility with respect to enforcing standards;
the diversity of the American system of higher education
would appear to require it.

Accrediting criteria: The regional and professional associations,
whose basic task is to insure that minimal standards are opera- 
tionalized, have articulated certain principles and criteria, often 
referred to as standards or guidelines, which are promoted to be 
attributes of institutional and program excellence or quality. 

Reviewing the published standards and guidelines of both
regional and professional associations, Petersen (1978) concludes
that "there is such a wide variety [of standards] among
agencies that almost any blanket conclusion or generalization is
suspect" (p. 306). Harris (1978) offers a somewhat different
opinion. In a report prepared for COPA, he identifies seven criteria
as being critical characteristics of an "accreditable" institution.2

1 Readers interested in the history of accreditation are referred to
Dickey and Miller (1972), Selden (1960), and Harcleroad (forthcoming).

1. Goals and objectives: Because institutions are evaluated
on the basis of their own purposes rather than by external
standards, they must have explicit, comprehensive, and consistent
goals and objectives that are subject to periodic review
and revision.

2. Governance, leadership, and structure: A basic premise of 
accreditation is that faculty possessing proper credentials will 
be significantly involved in designing curricula, setting graduation
requirements, and evaluating students; faculty, therefore,
will maintain academic standards because an appropriate struc- 
ture of academic and administrative checks and balances exists 
to monitor effectively the institution with respect to its purposes, 
programs, curricular planning, and degree requirements.

3. Validity of degrees: Student achievement is commensurate
with the general meaning of degrees awarded, and the institution
has a systematic means to assure that students meet
the letter and the spirit of degree requirements.

4. Adequate resources: Adequate human, physical, and fiscal
resources, as judged by academic peers, exist to accomplish
stated goals and objectives.

5. Stability: The prevailing values of the academy are best
represented by institutions that display evidence of stability and
permanence.

6. Students and programs: Student needs, interests, and aspirations
are reflected in institutional programs, and those
services logically related both to the institutional mission and to
student needs are provided.

7. Integrity: Institutional integrity is reflected in explicit
goals and objectives; full disclosure of codes, rules, and practices;
sound fiscal management; ethical recruitment and promotion
practices; consistent application of institutional codes;
and continued monitoring and self-assessment of institutional
behavior and practices against stated goals and objectives.

2 Harris (1979) focuses on "accreditable" instead of "good" because
the former term is the "more operational adjective," and because of
the membership component in the accreditation process; i.e., "accreditation
means that an institution makes itself amenable to the criteria
and the procedures of the association in which it seeks membership"
(p. 68).

Harris (1978) suggests that accreditation policies reflect
"the conventional wisdom of the academy [at any point in history]
about quality" (p. 62). Yet current developments (such
as nontraditional education, the increasing significance of accreditation
in the quest for federal dollars, and the shift, at all
degree levels, from a seller's to a buyer's market) pose a number
of challenges to the "conventional wisdom" regarding quality
and accreditation.

Troutt's (1979) textual analysis of the published criteria of 
the six regional accrediting associations reveals five criteria that
"claim some association with quality assurance. . . . Most regional
associations suggest a relationship between institutional
quality and criteria for: (1) institutional purposes and objectives;
(2) educational programs; (3) financial resources; (4)
faculty; and (5) library/learning resources" (p. 200). Troutt
identifies three basic assumptions underlying the criteria that
the regional associations promote as being related to institutional
quality. First, judgments about quality should be based on inferences
from specific conditions rather than on a direct evaluation
of student performance. Second, no common benchmarks
exist for measuring institutional quality. Finally, accreditation
criteria equate higher education with a production process. These
three assumptions contrast sharply with those of educational researchers
(e.g., Astin 1977b; Dressel 1978) who assert that
quality judgments should be based on an assessment of student
outcomes, that common benchmarks do exist, and that the production
model is neither the only, nor the best, model for describing
higher education (see Clark et al. 1972; and Walsh
1973). 

Graduate program accreditation, in contrast to general institutional
accreditation as coordinated by the regional associations,
is somewhat more specific about the attributes of program
quality. Graduate education is seemingly a more sacred bastion
than undergraduate education. Anderson (1978) observes that
while the "higher education establishment could tolerate wide
diversity and lesser quality in undergraduate programs and even
at the master's level . . . it registers deep concern when the
quality of the doctorate is diluted" (p. 279). Andrews (1978)
asserts that there is an inverse relationship between enrollments
by degree level and concern for program quality in higher
education: Graduate and professional programs, which enroll
the smallest number of students, have historically been the focus
of the debates on quality, while lower-division, undergraduate,
and vocational education have generally received little attention
in such discussions. Our survey of the literature confirms this
contention: Articles and documents on graduate education and 
graduate rankings outnumber those on undergraduate programs 
by a ratio of roughly six to one. 

State program review

The state role in higher education has changed considerably
during the last 16 years: from passive purveyor to concerned
underwriter. Similarly, the role of the program review process
has changed in response to a number of recent developments:
increased financial and political pressures for the efficient use of
resources, the proliferation of degree programs at all levels, and the
shrinking job market for degree-holders. Although other purposes
are attributed to the review process (e.g., to eliminate unnecessary
program duplication, to assure quality), the term "accountability"
not only best describes its rationale but also subsumes
the other purposes attributed to it (Barak and Berdahl

The state perspective on quality: To understand the place of
program quality assessment in the state review process, one 
must first be aware of the historical state perspective on quality. 

Like the higher education community in general, the state 
would seem to have rather traditional notions about institutional 
and program quality (see Halstead 1974, chapter 6). During the 
postwar period of rapid growth in higher education, the states
viewed quality as manifesting itself primarily in criteria established
by the academy: i.e., students with high test scores and
faculty with doctorates, research grants, and publications. These
are the attributes of quality that receive most attention in the
literature of the period (e.g., Berelson 1960; Committee of Fifteen
1955). The states purchased (or created) higher education
facilities for the benefit of their citizens, and the states "bought"
the value system of the academic community. The emphasis on
student and faculty credentials as attributes of institutional
quality was a response to market factors during the late 1950s
and most of the 1960s, when "high-quality" students and faculty
were in short supply.

Halstead (1974) indicates that state planning agencies have 
generally accepted responsibility for providing the leadership to 
improve the quality of public higher education. Until very re- 
cently, however, the states viewed program quality as depending 
almost entirely on the input characteristics of students and
faculty. The traditional state perspective is exemplified in the
following statement from the 1960 California Master Plan for
Higher Education: 

The quality of an institution and that of a system of higher
education are determined to a considerable extent by the abilities
of those it admits and retains as students. This applies at
all levels: lower division, upper division, and graduate. It is
also true for all segments, but the emphases are different. The
junior colleges are required by law to accept all high school
graduates (and even some nongraduates under some circumstances);
therefore the junior colleges must protect their
quality by applying retention standards rigid enough to guarantee
that taxpayers' money is not wasted on individuals who
lack the capacity or the will to succeed in their studies. If the
state college and the university have real differences of func- 
tion between them, they should be exacting (in contrast to pub- 
lic higher education in most other states) because the junior 
colleges relieve them of the burden of doing remedial work. 
Both have the heavy obligation to the state to restrict the 
privilege of entering and remaining to those who are well 
above average in the college-age group (California State Department
of Education 1960, p. 66).

As this statement makes clear, the California master plan is
based on a meritocratic model of program and institutional excellence
in that it provides greater resources and opportunities
for the academically endowed while regarding those students
"who lack the capacity or the will to succeed" as antithetical to
institutional quality. 

More recently, the states have moved beyond this perspective, 
expanding their focus to include educational process (e.g., the
provision of educational services, the impact of educational ex- 
periences) as well as student and faculty input characteristics as 
manifestations of quality. This shift in perspective is in large 
part a response to the demands from a number of constituencies 
for an accounting of (1) the resources allocated to public post- 
secondary education and (2) the availability and distribution of 
educational opportunities and benefits to various clienteles (see
Callan 1978). 

Kerr (1978) was among the first to describe the need/access 
versus quality/excellence debate (which centers on the availability
and distribution of educational opportunities and benefits)
from the standpoint of the states, warning that they would
find it difficult to satisfy the academic community's heightened
expectations for program expansion and quality improvement
and at the same time accommodate the increasing number of
high school graduates (and returning adults) with degree aspi- 
rations. Indeed, the state interest in, and responsibility for, edu- 
cational access and opportunity at all degree levels may well 
conflict with traditional notions of academic quality.

State program review criteria: From 1960 through 1976, not
only the number of state agencies but also their capacity to conduct
academic program reviews increased significantly. Using
data from the U.S. Office of Education (Martorana and Hollis 
1969) and the Education Commission of the States (1975b), 
Barak and Berdahl (1978) document a 105-percent increase 
(from 19 to 39) in the number of states with higher education 
coordinating or governing boards that have program review 
authority over both new and existing programs. The number of
states with governing or coordinating agencies that have au- 
thority only to approve new program proposals also increased 
over this period, from four to eight. Barak and Berdahl note, 
however, that an agency with legal review authority may not, 
for a number of reasons, exercise that authority, whereas in 
some states a review authority that does not exist in law may be 
exercised by other means and by other agencies (e.g., legislative 
budget reviews). 
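
The percentage figures in this paragraph follow from simple
arithmetic, sketched below in Python. The function name
percent_increase is an illustrative assumption; the counts are those
attributed above to Barak and Berdahl (1978).

```python
def percent_increase(before, after):
    """Percentage change from `before` to `after`, relative to `before`."""
    return (after - before) / before * 100


# States whose boards review both new and existing programs.
both = percent_increase(19, 39)

# States whose agencies may only approve new program proposals.
new_only = percent_increase(4, 8)

print(f"{both:.1f}% and {new_only:.1f}%")
```

The first figure rounds to the 105 percent reported in the text; the
second shows that the smaller group of states doubled over the same
period.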

Developing appropriate criteria for program reviews has not 
been easy: The process is as political and volatile as any activity
inside or outside the academy (Barak and Berdahl 1978; Hill
1978; Hill et al. 1979; Mingle 1978). The Task Force on Graduate
Education of the Education Commission of the States
(1975a) recommends ten factors to be considered in the program
review process. These factors are listed below; the figures
in parentheses indicate the number of states (among the 27
that currently conduct some sort of program review or that have
established procedures for review) that use the criterion:

1. Number of program graduates in each of the five preceding
years (15)

2. Student enrollment (matriculation and retention) (12)

3. Size of classes and cost of core courses (6)

4. Cost per program graduate (i.e., per degree awarded)
(9)

5. Faculty workload (2)

6. Program quality, as reflected in (a) reputation, (b)
faculty qualifications, and (c) the employment experience
of program graduates (8)

7. Comparative analysis of the production of program
graduates from similar types of programs in the state,
the region, and the nation (8)

8. Economies or improvements in quality to be achieved
through program consolidation or elimination (3)

9. General student interest and demand trends (10)

10. Appropriateness of the program, given the institutional
mission (10)

The most frequently used criteria are measures of productivity,
costs, and the compatibility between program and institutional
mission (Barak and Berdahl 1978).
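
The ranking behind this observation can be reproduced from the counts
in the list above. The short Python sketch below is only an
illustration: the abbreviated labels and the variable names are
assumptions made here, and the counts are copied from the list as
printed.

```python
TOTAL_STATES = 27  # states with some form of program review, per the text

# Abbreviated criterion label -> number of states using it (from the list).
usage = {
    "Program graduates, past five years": 15,
    "Student enrollment": 12,
    "Class size and core-course cost": 6,
    "Cost per program graduate": 9,
    "Faculty workload": 2,
    "Program quality": 8,
    "Comparative production of graduates": 8,
    "Economies from consolidation": 3,
    "Student interest and demand": 10,
    "Fit with institutional mission": 10,
}

# Sort by count, highest first; tied criteria keep their list order.
ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)

for label, count in ranked:
    share = count / TOTAL_STATES * 100
    print(f"{count:2d} of {TOTAL_STATES} ({share:4.1f}%)  {label}")
```

Sorted this way, the five most widely used criteria concern graduates
produced, enrollment, student demand, fit with mission, and cost per
graduate, which is consistent with the summary above.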

Barak and Berdahl (1978) identify nine states that consider 
program quality in the review process. It is interesting, but not
surprising, that the states generally have not developed new
procedures for assessing program quality; rather, they tend to
adopt the procedures and criteria articulated by accrediting
agencies and educational researchers: e.g., student characteristics,
faculty qualifications and research productivity, and peer
review. Review procedure guidelines, agency policy statements, 
and evaluation committee reports indicate the extent to which 
traditional measures of quality have been accepted in the state 
program review process. For example, a 1978 policy statement 
of the New York Regents proclaims that 

the attributes of [the] quality of a program are widely known
and accepted. Among these are the level of faculty research
and scholarship; the effectiveness of and attention to teaching
and counseling by the faculty; the caliber of students; the caliber
of dissertations; the adequacy of laboratory, library, and
other related facilities; the presence of supporting and related
programs. (Regents of the University of the State of New 
York 1978, pp. 17-18)

Following the lead of the Regents, the report of the New 
York Chemistry Program Evaluation Committee states that the
factors "of central importance to the committee" as measures
of program quality were "the quality of the faculty, the research
interests of the faculty, and the quality of the students" (State of
New York, Chemistry Program Evaluation Committee, n.d., p.
6).

While a number of states consider productivity factors in 
the review process, only the Florida Regents see productivity as 
being directly related to program quality: 

It would be impossible to conduct a thorough investigation of
every program every year. The use of degree productivity as a
means of identifying programs to be evaluated rests on the assumption
that with the exception of professional programs
such as medicine and law, degree productivity is the best
single index which correlates meaningfully with the enrollments
of majors in the program, student demand, [the] job
market for graduates, and [the] quality of the program.
(Florida Board of Regents, March 8, 1974; cited by Barak and
Berdahl 1978, pp. 62-63)

This view is indeed unusual.3 Although productivity issues are a
concern in the review process, few states have broken new
ground by expanding the conceptualization of quality criteria in
the manner articulated by the Florida Regents. In summary,
most states continue to view quality from the traditional perspective,
focusing on process variables (e.g., faculty, institutional
resources) and input variables (e.g., student characteristics).

Summary and conclusions

Historically, the states have financed higher education and left 
the issue of quality assessment and management to the academic 
institutions, as implemented by the accreditation process. In the 
past 15 years, public concern for excellence/quality in higher
education has been affected by a number of factors: (1) reduced
demand for higher education and the financial consequences, including
accountability, which have accompanied it; (2) federal
incentives, such as the 1202 legislation, which promotes state- 
wide and regional planning and coordination; (3) the postwar 
transition of higher education from option/opportunity to en- 
titlement, formalized by the Basic Grants (BEOG) legislation of 
the Educational Amendments of 1972; (4) increasing concern
for consumer protection; and (5) growing emphasis on the out- 
comes and benefits of college attendance, stimulated by the equal 
opportunity concerns and the job market displacements of the 
1970s. These factors, and others, have served to focus renewed 
attention on accreditation and new attention on state program 
reviews as "public" or external assessments of quality in Ameri- 
can higher education. 

Accreditation has two characteristics that distinguish it from 
other forms of quality assessments. First, accreditation focuses 
on an institution's capacity to achieve, and the extent to which 
an institution does achieve, articulated goals and objectives. 
Second, accreditation assessments are not competitive, i.e., in- 
stitutions are not compared and ranked. State program reviews, 
primarily concerned with resource allocation, are unique in that 
they address issues pertaining to finances, access and oppor- 
tunity, service to client populations and to the commonweal, and 
productivity. Taken together, accreditation and program reviews 
add other dimensions and other concerns--public concerns--to 
the discussion of quality in higher education. 

⁵ The use of productivity measures as a yardstick for assessing pro- 
gram quality and as a vehicle for identifying those programs requiring 
more comprehensive review proved to be controversial and has since 
been modified. 
Conclusions 



The previous chapters have reviewed the literature to illuminate 
the question: What is quality in American higher education? It 
would appear that the definition of quality varies with the con- 
text, depending on who is doing the assessment, by what means, 
and for what purpose. 

In academic studies, usually conducted by researchers from 
the higher education community, assessments have focused on 
identifying the "best" institutions (or graduate departments). 
Whether based on peer review or on the application of a set of 
traditionally used quantifiable indicators (which generally cor- 
relate highly with each other and with peer ratings), such as- 
sessments simply ignore about 99 percent of the institutions that 
constitute the nation's higher education enterprise. Moreover, as 
critics have pointed out, these rankings serve to reinforce the 
hierarchical structure of the system, in that those few institu- 
tions, departments, or professional schools at the top of the 
pyramid continue to capture what may be more than their fair 
share of scarce resources (including highly able students). Thus, 
their prestige is further enhanced, while the incentive to im- 
prove their educational programs may be reduced. 

From this review of the quality literature, certain conclusions 
emerge as to how quality in higher education might be better 
defined and how methods of assessing quality might be im- 
proved. 

First, quality assessments must be referenced to depart- 
mental or institutional goals and objectives. Although academics 
may find it hard to agree on the proper goals and objectives of 
higher education, some specification of desired outcomes is re- 
quired as a first step in the assessment process. 

Second, the diversity of American higher education must be 
recognized and accepted rather than (as is too often the case) 
simply paid lip service. Different institutions and programs serve 
different constituencies and have different goals and objectives. 
To measure them all by the same yardstick is to do a disservice 
not only to the higher education system but also to prospective 
students and to the public as a whole. Rather, those concerned 
with assessing quality must be more flexible, more willing to try 
a variety of quality measures or criteria that may be appropriate 
to different types of institutions and programs at different 
levels. At the same time, they must make clear just what criteria 
are being used, e.g., student quality, institutional resources, 
faculty productivity, the learning environment. 

Third, and closely related to the second point, new criteria 
should be incorporated in assessments of the higher education 
system: for example, student satisfaction with the educational 
experience; faculty satisfaction with the academic climate; em- 
ployer satisfaction with graduates; access and retention; serv- 
ices and benefits to the local community or the state. At the 
same time, it should be recognized that the importance of these 
criteria may vary by discipline, educational level, and type of 
institution. 

Fourth, quality assessments should give less emphasis to 
simply labeling programs and institutions (e.g., "the best," 
"good," "marginal") and more to pointing the way to improve- 
ment. Stronger efforts should be made to identify the special 
strengths and weaknesses of particular programs and institu- 
tions. 

Fifth, quality assessment should be dynamic rather than 
static, taking into consideration not only where a program or 
institution is now but also where it has come from and where it 
has the potential to go in the future. Related to this point, in- 
stitutional officials, aided by accrediting agencies, and by the 
states, have a responsibility to develop viable implementation 
plans to assist this kind of long-term institutional development. 

Sixth, more attention should be paid to the "value-added" 
concept of higher education. To give an institution or a program 
high marks for the resources it is able to attract, without regard 
for what it does with those resources, is surely to overlook the 
whole purpose of education at any level: to bring about certain 
desired changes in students. Before we can judge how well an 
institution does with and by its students, we must know what 
the students were like at college entry. The input-environment- 
outcome model is a conceptual tool whereby the characteristics 
of entering students can be taken into account to arrive at an 
assessment of the impact of the college experience itself. Such 
an approach is especially necessary in a period when the twin 
doctrines of entitlement and equal educational opportunity are 
espoused as worthy social goals. 

Seventh, failure to address the teaching-learning function 
perhaps represents the greatest weakness of quality assessments 
of American higher education. 

Over a decade ago, Allan Cartter stated the challenge to 
which we have only begun to respond adequately: 



Diversity can be a costly luxury if it is accompanied by igno- 
rance. Our present system works fairly well because most stu- 
dents, parents, and prospective employers know that a bache- 
lor's degree from Harvard, Stanford, Swarthmore, or Reed is 
ordinarily a better indication of ability and accomplishment 
than a bachelor's degree from Melrose A&M or Siwash College. 
Even if no formal studies were ever undertaken, there is al- 
ways a grapevine at work to supply impressionistic valuations. 
However, evaluation by rumor and word of mouth is far from 
satisfactory . . . Just as consumer knowledge and honest adver- 
tising are requisite if a competitive economy is to work satis- 
factorily, so an improved knowledge of opportunities and of 
quality is desirable if a diverse educational system is to work 
effectively. 

Evaluation of quality in education, at both the undergradu- 
ate and graduate levels, is important not only in determining 
the front-ranking institutions, but also in identifying lower- 
ranking colleges. Many prospective graduate students would 
not be suited to an education at Harvard, the Rockefeller In- 
stitute, or California Institute of Technology. Other institu- 
tions, in view of their educational offerings, level of work, and 
quality of students, would provide a happier and more pro- 
ductive experience. Universities, through their selection pro- 
cedures, and students, through their natural proclivities, tend 
to sort themselves out into congenial environments. (Cartter 
1966, p. 3). 

In the expansionist era of the 1960s, the nation could afford 
to support its many colleges and universities, with their mul- 
tiplicity of programs, without looking too closely at the contri- 
butions they made toward the achievement of desired goals. 
Now, however, as the college-age population declines in number, 
as inflation continues to erode financial resources, as the value 
of higher education comes to be questioned, and as a number of 
institutions, both public and private, struggle to survive in the 
changing climate of the 1980s and 1990s, hard decisions will 
have to be made about what should be retained, what altered, 
and what eliminated in our current pluralistic system. Thus, the 
need to define quality in meaningful ways, and to find better 
means of assessing it, is imperative. Such efforts should be 
grounded in a commitment to the diversity of the system, an 
understanding that, if higher education is to serve the needs of 
a heterogeneous population, diversity is much more than just a 
"costly luxury." 